In an attempt to overcome the weaknesses of the traditional school
organization many progressive schools have developed new programs.
These programs are so similar in character that collectively the
changes have been referred to as the activity movement. This
movement has claimed the center of the educational stage for a length
of time sufficient to have engendered widespread interest in its
outcomes and in its basic philosophy.

In Doctor Hissong’s study an attempt has been made to discover 
the principles underlying the present activity movement, to determine 
the influence of traditional concepts in shaping the trends of the 
movement, and to see if in the light of the present knowledge of the 
child and his relation to his environment the movement rests upon a 
justifiable basis.

OBJECTIVITY AS A CRITERION FOR ESTIMATING
THE VALIDITY OF QUESTIONNAIRE DATA


FRANCIS F. SMITH 
Fresno State Teachers College, Fresno, California 


To what extent do respondents agree with one another? In many 
questionnaires calling for factual data, this is not an appropriate 
question, since it is expected that facts which are personal, social, or 
professional in nature, for example, will be unique and different for each 
person, except by coincidence. In other questionnaires, however, such 
as those asking for opinion, judgment, evaluation, allocation, and so on, 
secured through rating, ranking, choice from pairs or groups, or other 
psychophysical devices, it may be important to check on the agreement 
of individual with individual as a means of arriving at some estimate 
of the validity of the data. 

In addition to asking the degree of agreement of individual with
individual, it is also especially important to ascertain the agreement
of group with group, since in composite responses, individual
idiosyncrasies may cancel one another, variable errors in one direction
canceling variable errors in the opposite direction. In fact, it will be shown
that the average inter-agreement of group with group is nearly always
higher than average inter-agreement of individual with individual.

While neither high agreement among individuals nor high
agreement among groups constitutes complete proof of validity, both are
commonly accepted as evidence of validity.

In questionnaire investigations, when individuals or groups agree 
highly, it is a fair assumption that they are using the same or similar 
standards or the same mechanical aids to judgment in arriving at their 
conclusions. Hence the conclusions which they make may be said to 
have been arrived at objectively, and the data which they furnish 
may be said to be objective. On the other hand, if the agreement is 
low, it seems reasonable to assume that each respondent must have
used his own private or subjective standards in forming his conclusions,
and his data are likely to be subjective.

One of the most fruitful means of finding the extent of the
agreement of individual respondent with individual respondent is to analyze
questionnaire studies which require a ranking of items. By using
Kelley’s average intercorrelation formula¹ it is possible to calculate
the average agreement of individual with individual in such studies.
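
The average agreement described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it takes complete, untied rankings and averages Spearman's rank correlation over every pair of respondents by brute force. It is not Kelley's shortcut formula, which the article cites but does not reproduce, and the function names and toy data are hypothetical.

```python
from itertools import combinations


def spearman_rho(rank_a, rank_b):
    """Spearman rank correlation for two untied rankings of the same items."""
    n = len(rank_a)
    d_squared = sum((a - b) ** 2 for a, b in zip(rank_a, rank_b))
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))


def average_intercorrelation(rankings):
    """Mean rho over all N(N - 1)/2 pairs of respondents."""
    pairs = list(combinations(rankings, 2))
    return sum(spearman_rho(a, b) for a, b in pairs) / len(pairs)


# Hypothetical data: three respondents each ranking the same four items.
toy_rankings = [
    [1, 2, 3, 4],
    [1, 2, 3, 4],
    [4, 3, 2, 1],
]
print(average_intercorrelation(toy_rankings))
```

In the toy data two respondents agree perfectly and the third reverses the order, so one pair correlates +1 and two pairs correlate -1, and the average falls below zero.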

Following are eleven studies making use of this formula. The 
number of theoretical correlations between respondent and respondent 
resulting from these studies equals forty-seven thousand nine hundred 
ten; it should, therefore, be possible to get a good idea of about how 
well one respondent agrees with another in such studies. The plan 
to be followed for these studies is to give a brief description of each, 
under the captions of Study 1, Study 2, etc., and then to summarize 
the results in Table I. 

Study 1.²—Fourteen students in a college class in civic education
ranked thirty pictures of homes, ranging from two-room to about
twelve-room houses, in order of beauty. On the backs of the pictures,
for purposes of identification, were code numbers selected at random.
The directions were, “Rank these homes from most beautiful to least
beautiful giving the most beautiful a rank of 1, the next most beautiful
a rank of 2, and so on. Think of beauty only in terms of the impression
which you think this home would give from the street.” No standards
for judging homes had been given or discussed; the only condition
which preceded the experiment was a general discussion of the topic,
“Beautifying Our City,” and the part which attractive homes played
in such a project. The average intercorrelation in this study is
0.047, this being practically nothing. The respondents had no
common standards by which to make their judgments, with the
result that there was no agreement among them.

Study 2.—Twenty-eight teachers were asked to rank seven courses 
in Education in order of the practical value which they thought that 
these courses had had for them in helping them to succeed in teaching, 
giving a rank of one to the most valuable, a rank of two to the next 
most valuable, and so on.³ The average intercorrelation in Study 2
is 0.733, indicating fairly high agreement among the respondents,
presumably due to their having more definite ideas concerning the
items ranked and to there being only seven items to be ranked, a
relatively simple task.

1 Kelley, Truman L.: Statistical Method. New York: The Macmillan Company,
1933, pp. 217-221.

2 When the footnote references are not indicated, the investigation was
conducted originally by the author.

3 The author is indebted to Dr. T. L. Nelson for these data.

Study 3.—Theisen¹ asked five hundred thirty-one respondents
interested in educational administration to arrange nineteen duties
performed by city boards of education, in order of the importance of
these items, giving rank one to the most important and rank nineteen
to the least important.

Study 4.—Thorndike² asked seventy-eight teachers, principals, and
superintendents of schools to arrange eleven studies and activities
“according to the amount of disciplinary value (per hour of time spent)
to you at the age of fourteen to eighteen.” The study or activity
having the most value was ranked one; the next most value two;
and so on.

Study 5.—Six college students in the author’s class in statistics 
were asked to rank ten Americans who had been mentioned the 
greatest number of times by several classes in civic education as the
“greatest Americans.” The American whom they deemed the greatest
was to be given a rank of one, the next greatest a rank of two, and so 
on. 

Study 6.—Study 5 was repeated the following year by ten students 
in the author’s class in statistics. 

Study 7.—Fifty-three eighth-grade and low-ninth-grade girls were 
handed an alphabetically-arranged list of sixteen books and asked to 
write one after the book they liked best, a two after the book they 
liked next best, and so on until they had written a number opposite 
every book. 

Study 8.—The list of books used in Study 7 was handed to thirteen
college students in a class in educational measurements, with the
directions, “Arrange these books in the order in which you think that
girls between the ages of fourteen and sixteen would rank them for
interest—in the order in which they would say that they like them.
Give the book they would like most a rank of one, the other they would
rank next a rank of two, and so on.”

1 Theisen, W. W.: The City Superintendent and the Board of Education.
Teachers College Contributions to Education, No. 84. New York: Bureau of
Publications, Teachers College, Columbia University, 1917, pp. 26-29.

2 Thorndike, Edward L.: “The Disciplinary Values of Studies.” Teachers
College Record, Vol. XXV, March, 1924, pp. 134-136.

The tables in Theisen’s and Thorndike’s studies were reported in such form
that Kelley’s formula could easily be applied to them. The average
inter-correlations were calculated by the author of this article.

Study 9.—Study 8 was repeated the following year, the ranking 
being done by twenty-two students in the author’s class in educational 
measurements. 

Study 10.—Seven students in the author’s class in statistics were 
handed an alphabetically-arranged list of seventeen occupations and 
asked to respond to the following directions: “If your fairy godmother
told you that you might choose any vocation you liked from the 
following list, and if she also told you that whatever position you 
chose, you might attain highest eminence—be the most successful 
in your line—which would you take as your first choice, as your second 
choice, and so on for the seventeen.” 

Study 11.—Twenty students in a college class in civic education 
were asked, at the end of the semester, to rank one another from the 
standpoint of scholarship in that class. The class had been instructed
by the project method, so that during numerous trips to places of 
civic interest, and in the course of several individual and group projects, 
these students had had an unusually good opportunity to become well 
acquainted. The student who had contributed most to the success 
of the semester’s work was to be given rank one; the student who had 
contributed next most, a rank of two; and so on. 

The important results of the eleven studies are summarized in 
Table I. 

These eleven studies, dealing with different types of subject-matter,
and representing respondents ranging from school children to
professional adults, show a striking similarity in the extent of inter-agreement
among respondents. Studies No. 1 and 2 indicate the extremes of
this agreement. The former called for a ranking of thirty items, and
for this task the respondents had little or nothing in the way of
standards upon which to make intelligent judgments. The latter
asked for a ranking of only seven items; on these the respondents
seem to have had much in common upon which to base their
judgments. The remainder of the studies seem to show what may usually
be expected in questionnaire studies calling for a ranking of items;
namely, little agreement among individual respondents.

Averaging the eleven studies results in an average intercorrelation 
for the forty-seven thousand nine hundred ten theoretical correlations 
of 0.362. How good is a correlation of 0.362? The answer is found
in the coefficient of alienation,¹ the formula for which is $\sqrt{1 - r^2}$.
Substituting in the formula and solving gives 0.932. Subtracting
0.932 from 1.000 gives 0.068, or nearly seven per cent. If the
investigator wished to know the order in which one respondent would rank
the items to be ranked, knowing the order in which another respondent
had ranked them, his prediction from this correlation would be about
seven per cent better than chance—seven per cent better than a mere
guess. Evidently, individual does not agree with individual to any
marked degree.


TABLE I.—SUMMARY OF STUDIES 1 TO 11, SHOWING THE EXTENT OF AGREEMENT OF
INDIVIDUAL RESPONDENT WITH INDIVIDUAL RESPONDENT

                                          Theoretical number
Study     Number of       Number of       of intercorrelations    Average
number    items ranked    respondents     N(N - 1)/2              intercorrelation

   1          30               14                  91                  .047
   2           7               28                 378                  .733
   3          19              531              42,480                  .325
   4          11               78               3,003                  .185
   5          10                6                  15                  .455
   6          10               10                  45                  .435
   7          16               53               1,378                  .348
   8          16               13                  78                  .410
   9          16               22                 231                  .383
  10          17                7                  21                  .319
  11          20               20                 190                  .340

Total         ..              782              47,910                   ..
Mean        15.6               ..                  ..                  .362
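
The arithmetic behind the "seven per cent better than chance" statement can be retraced with a short Python sketch. It applies the coefficient of alienation quoted in the text to the average intercorrelation of 0.362; the function and variable names are illustrative only.

```python
from math import sqrt


def coefficient_of_alienation(r):
    """k = sqrt(1 - r^2), the formula quoted in the text."""
    return sqrt(1 - r ** 2)


r = 0.362  # average intercorrelation reported for the eleven studies
k = coefficient_of_alienation(r)
improvement = 1 - k  # share by which a prediction beats a pure guess
print(round(k, 3), round(improvement, 3))  # prints 0.932 0.068
```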

An interesting point in connection with Table I is that the
correlation between the number of items ranked and the average
intercorrelations is -0.74. As might be expected, other things being equal,
the longer the list of items to be ranked, the more difficult the task of
ranking and the less the agreement among the respondents.
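
The relation between list length and agreement can be rechecked from the figures printed in Table I. The sketch below assumes an ordinary product-moment correlation, since the article does not say which coefficient was used, and the names are illustrative. With the rounded table values the result should land close to the reported figure, though not necessarily on it.

```python
from math import sqrt

# Values copied from Table I, in study order 1 to 11.
items_ranked = [30, 7, 19, 11, 10, 10, 16, 16, 16, 17, 20]
average_r = [0.047, 0.733, 0.325, 0.185, 0.455, 0.435,
             0.348, 0.410, 0.383, 0.319, 0.340]


def pearson(x, y):
    """Product-moment correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


r_items = pearson(items_ranked, average_r)
print(round(r_items, 2))
```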

Is any higher individual inter-agreement obtained when some
technique other than ranking is employed? The next five studies
will report the results when paired-comparison, rating scales, and
allocation of points from a possible total or scoring, are used, respectively.

1 Kelley, Truman L.: Op. cit., pp. 173-174.

In a paired-comparison study in civic education class, similar to
Study No. 11 above, twenty-two college students were handed a
schedule on which every student in the class was paired with every
other. In a class of twenty-two students, this would make $N(N - 1)/2$,
or two hundred thirty-one pairs. Each student’s name appeared
first in the pair half the time and last in the pair the other half. The
order of pairs was random, no student’s name being allowed to appear
twice in succession. The students were asked to underline the name
of the person in each pair who had contributed most to the success
of the semester’s work. Each student signed his schedule. The
number of votes given by him was tabulated in such a way that the
amount of agreement of each student with each other could be
calculated. The distribution of the correlations is shown in Table II.
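
The pair count for the paired-comparison schedule can be confirmed with a small Python sketch. The student labels are hypothetical, and the sketch only enumerates the pairs; it does not reproduce the balancing of first and last positions or the randomized order described in the text.

```python
from itertools import combinations


def number_of_pairs(n):
    """N(N - 1)/2, the number of distinct pairs among N people."""
    return n * (n - 1) // 2


# Hypothetical labels standing in for the twenty-two class members.
students = [f"student_{i}" for i in range(1, 23)]
schedule = list(combinations(students, 2))
print(number_of_pairs(22), len(schedule))  # both are 231
```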


TABLE II.—FREQUENCY DISTRIBUTION OF INTER-CORRELATIONS AMONG STUDENTS’
RATINGS OF ONE ANOTHER BY THE PAIRED-COMPARISON TECHNIQUE

CORRELATIONS                 FREQUENCY
-.40 to -.49                      1
-.30 to -.39                      1
-.20 to -.29                      2
-.10 to -.19                     12
-.00 to -.09                      7
 .00 to  .09                     30
 .10 to  .19                     38
 .20 to  .29                     46
 .30 to  .39                     37
 .40 to  .49                     22
 .50 to  .59                     19
 .60 to  .69                      9
 .70 to  .79                      5
 .80 to  .89                      1
 .90 to  .99                      1

Total                           231
Mean                           .255
Standard deviation             .229


While the paired-comparison technique has some advantages, from 
the standpoint of the ease with which the respondent may make his 
judgments, the results of this last study are about the same as the 
results of the studies involving ranking.

In a study of the rating of teachers by students, Guthrie reports
an average intercorrelation of student with student of 0.262. He
summarizes his findings as follows:


. . . the correlation of a teacher’s position on the first card on which his name
appeared with his rank position on the second card, the third rank with the fourth
rank, and so on. This correlation was .262. In other words, knowing the rank
assigned to a teacher by one student, the rank assigned him by any other single
student could be predicted with an error of ninety-six per cent of the error of a
pure guess.¹


Three studies of ratings of cadet-teachers by supervisors and by the
cadet-teachers themselves by means of score cards give further evidence
on this low agreement of individual with individual. The score cards
used were Boyce score cards, made up of forty-five traits, and a small
score card consisting of fifteen traits. The former was used in a small
city school system; the latter in a large city school system. The
correlations show the extent of the agreement of the supervisors and
the cadet-teachers on ratings of the latter. In the latter study there
are two sets of such correlations; the cadet-teachers were scored
twice and they scored themselves once.² The results are shown in
Table III.

The results of the last five studies are strikingly similar to the
previous eleven—they all show low inter-agreement among individual
respondents from every angle from which the problem of inter-agreement
has been attacked.

Another conclusion is evident in Table III. Some supervisors
attain higher objectivity than others. This difference in objectivity
is also shown in comparison of the differences in average agreement
which the students in the paired-comparison study achieved. This
was calculated from a table of correlations not included in this study.
The lowest average agreement of one student with all the others was
0.088; the highest was 0.415. Some respondents are more dependable
than others. Not only can they stick to their own stories better,
but they are more consistent in their agreement with other people
than are other respondents. It seems entirely possible that if
respondents were selected for certain types of studies, especially those
involving judgment, on the basis of their relatively high reliability and
objectivity, the validity of such data might be improved, to the extent that
reliability and objectivity are evidences of validity.

1 Guthrie, E. R.: “Measuring Student Opinion of Teachers.” School and
Society, Vol. XXV, February 5, 1927, pp. 175-176.

2 The author is indebted to Mr. Rudolph Lindquist and to Mr. W. T. Helms for
these data.

TABLE III.—AGREEMENT OF SUPERVISORS AND CADET-TEACHERS ON SCORE-CARD
RATINGS OF THE LATTER

                                     Frequencies
                                          Large city
Correlations          Small city    First study   Second study

-.60 to -.69               0             0              1
-.50 to -.59               0             0              1
-.40 to -.49               0             0              1
-.30 to -.39               1             2              0
-.20 to -.29               1             3              3
-.10 to -.19               3             0              3
-.00 to -.09               2             3              3
 .00 to  .09               9             4             11
 .10 to  .19               5             7             12
 .20 to  .29              13            12              8
 .30 to  .39              13            11             13
 .40 to  .49               8            15             14
 .50 to  .59               3             9              7
 .60 to  .69               4             9              5
 .70 to  .79               1             5              3
 .80 to  .89               3             2              2

Total                     66            82             87
Mean correlation         .29           .36            .27
Standard deviation       .25           .27            .29


An interesting contrast to the low agreement found among
respondents in the previous studies is disclosed in a study in which all the
respondents were experts. All respondents had acquired common
standards upon which to base their judgments. This is a study in
which twenty students of educational administration each scored a
high-school building, using the Strayer-Engelhardt Score Card for
High School Buildings. Through class discussion and readings, all
had become familiar with the standards and had had some practice
in their use. The inter-agreement among the scorers is shown in
Table IV. The variables in the calculation of the correlations were
the scores assigned by each scorer to the sub-items A, B, C, etc.;
the totals under the major headings I, II, III, etc. were not included
in the calculations, since this would have made the correlations
artificially high.

1 The author is indebted to Professors Frank W. Hart and L. H. Peterson and
their students in educational administration for the data used in this study.


TABLE IV.—INTER-AGREEMENT OF SCORERS OF A HIGH SCHOOL BUILDING WITH
THE AID OF THE STRAYER-ENGELHARDT SCORE CARD

CORRELATIONS                 FREQUENCY
…                                 3
…                                 6
…                                16
…                                30
…                                61
…                                46
…                                27
…                                 1

Total                           190
Mean                           .956
Standard deviation             .015


The average intercorrelation shown by the first eleven studies 
reported in this investigation was 0.36; the average of the next five 
was 0.29. These are very low compared to the average intercorrelation 
shown by the scorers of the school building, which was nearly 0.96. 
It seems clear that in many studies involving opinion, judgment, 
evaluation, and other such subjective means of determining a response 
—studies in which the respondent is asked to make a response without 
the help and control of standards or mechanical aids—the
inter-agreement among the respondents is low and the variability of that
agreement is wide. The inter-agreement may be made very high and 
the variability made almost negligible by using trained respondents 
who have acquired common standards or who make use of mechanical 
aids to help them to make their decisions. 

The latter procedure has always been characteristic of science. 
Ask a room full of people, for example, to tell you, as nearly as they can 
from where they sit, how long the room is from front to back. To one 
who has never tried this before, the variability of response is
astounding. Now hand each person a yard stick and ask him to measure the
length of the room from front to back; there will still be variability of 
response, but it will be very small compared to the variability in the 
first case. A mechanical aid to judgment reduced the variability of 
response. 

It is fairly safe to say that if twenty students of educational
administration were handed a list of items, similar to a teacher’s score card,
and asked to allocate one thousand points to these items in the same
loose way that a supervisor allocates her points to a set of very general 
and undefined traits, the inter-agreement would be very much lower 
than it was in the above study. Direct and control their observations, 
however, by training them in the use of specific and definitely-defined 
standards, and you get superior responses. The judgments and 
conclusions of experts have always been considered, among well- 
informed people, superior to the judgments and conclusions of Tom, 
Dick, and Harry. In many questionnaires it would be wiser to get the 
responses of a few carefully selected experts than to broadcast the 
questionnaire, hoping thus to get valuable and valid data upon which 
to base conclusions. 

There is, however, another way of increasing objectivity. Even 
though the inter-agreement among individual respondents is low, it is 
possible to combine the responses of these individual respondents and 
get higher agreement among the groups made up of these individuals 
than among the individual members themselves. In a group of one 
hundred respondents, the average agreement of one individual with 
another may be represented by an average intercorrelation of 0.30, 
but if the responses of fifty of these respondents are combined into 
one composite and this composite correlated with the composite of the 
second group of fifty, the agreement of one group with the other may 
be very high. In fact, the agreement between these two groups may 
be about as high as the agreement of one of the experts in Table IV 
with another such expert. It is possible to demonstrate the truth of 
this statement in two ways. The most direct way is to combine 
such responses into composites and then calculate the correlations 
between the composites. Another way is to calculate the theoretical 
correlations from the Spearman-Brown formula.¹ If the average
intercorrelation (that is, the average agreement of individual with 
individual) were 0.30, and the investigator wished to know how high 
would be the correlation of one group of fifty with a second group of 
fifty, by applying this Spearman-Brown formula he would get .955. 
This is practically the same as the average inter-agreement of the 
scorers of the high-school buildings reported in Table IV. 
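
The theoretical figure quoted above can be reproduced with the usual statement of the Spearman-Brown formula, r_n = n r / (1 + (n - 1) r). The article cites the formula without printing it, so the form below is the standard textbook one and the function name is illustrative.

```python
def spearman_brown(r_single, n):
    """Predicted correlation between two composites of n respondents each,
    given the average individual-with-individual correlation r_single."""
    return n * r_single / (1 + (n - 1) * r_single)


# Average intercorrelation 0.30, two groups of fifty respondents.
print(round(spearman_brown(0.30, 50), 3))  # prints 0.955
```

With n = 1 the function returns the individual correlation unchanged, which is a quick sanity check on the formula.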

In order to see whether the actual correlations of composites would
agree with the theoretical correlations as calculated by the use of this
formula, the actual correlations were calculated, in three different
studies, between composite scores made up of the sums of the responses
of two respondents, of four respondents, and so on. Then taking the
average intercorrelations previously reported, the correlations which
should theoretically be expected were estimated, by using the
Spearman-Brown formula. The closeness of agreement of the two is shown
in Table V.

1 See Kelley, Truman L.: Op. cit., pp. 205-208, for a discussion of this formula.


TABLE V.—INCREASE IN INTER-AGREEMENT OF GROUPS RESULTING FROM INCREASE
IN SIZE OF GROUP

                                        Intercorrelations in
Number of       Study in which         Study in which           Study of
respondents     members of class       members of class         Strayer-Engelhardt
in composite    ranked one another     rated one another by     scoring of
                                       paired comparison        school buildings
                Actual     Estimated¹  Actual      Estimated¹   Actual      Estimated¹

   1²            .34          .34      .26 ± .16      .26       .96 ± .01      .96
   2              …            …       .37 ± .14      .42          …            …
   4          .72 ± .02       .66      .63 ± .11      .52       .99 ± .002     .99
   …              …            …       .71 ± .03      .71          …            …
  10             .88           …          …            …          .997        .996
  11              …            …         .84          .79          …            …
  20³             …            …          …            …           …          .996
  22³             …            …          …           .91          …            …

1 Estimated by the Spearman-Brown formula.

2 The figures opposite 1 are the average intercorrelations already calculated.

3 The correlations opposite 20 and 22 are the estimated correlations of the
full-group from the correlations of the half-group, opposite 10 and 11.


There are twenty respondents in two of the studies and twenty-two
in the other. In the first two studies, composite scores of ten
respondents were correlated with composite scores of the second ten; in the
last study, composite scores of eleven were used. Then by using a
different form of the Spearman-Brown formula,¹ it was possible to
estimate what would be the correlation between the full group and
another theoretically similar group. The actual correlations between
the half-groups and the estimated correlations between the
full-groups are shown in the last four rows of Table V.





1 Ibid., p. 206. 







492 The Journal of Educational Psychology 


The estimation of the correlation of the full-group, when the
correlation between the half-groups is known, may be illustrated in the
study in which members of a class rated one another by the
paired-comparison technique. The correlation of one group of eleven with
the second group of eleven was, by actual calculation, 0.84; the
estimated correlation by the Spearman-Brown formula of the full
group of twenty-two with another theoretically similar group of
twenty-two was 0.91.
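
The step from half-group to full-group can be retraced under the assumption that the "different form" of the Spearman-Brown formula is the familiar doubling case, 2r / (1 + r). The article does not print the form it used, so this is an inference, though it does reproduce the 0.91 reported above. The function name is illustrative.

```python
def full_group_from_halves(r_half):
    """Spearman-Brown for a doubled group: 2r / (1 + r)."""
    return 2 * r_half / (1 + r_half)


# Eleven against eleven correlated 0.84 in the paired-comparison study.
print(round(full_group_from_halves(0.84), 2))  # prints 0.91
```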

Here is a curious phenomenon: The composite decisions of one
group agree with the composite decisions of a second similar group,
and yet the individuals making up the composites agree only to a
slight extent among themselves. Except for a few experts, then,
the opinions and judgments of the group seem to be more valuable,
as judged by the criterion of objectivity, than the opinions and
judgment of the individual. If agreement among groups can be taken as
one evidence of validity, the collection of data by means of the
questionnaire would seem to have considerable justification, even in
studies involving opinion and judgment. Caution must be exercised,
however, in accepting high objectivity as full proof of validity.
Objectivity and validity are not necessarily synonymous. Groups may
agree highly among themselves and yet at times be wrong. Accepted
as some proof of validity, they must not be considered infallible.
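
The phenomenon described above can be illustrated with a small simulation. Everything in it is an assumption made for illustration: each simulated respondent scores thirty items as a shared signal plus private noise, and the respondent count, noise level, and seed are arbitrary. None of the article's data are used.

```python
import random
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Product-moment correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def simulate(n_respondents=40, n_items=30, noise_sd=1.5, seed=0):
    """Return (mean individual-pair r, r between two half-group composites)."""
    rng = random.Random(seed)
    shared = [rng.gauss(0, 1) for _ in range(n_items)]
    # Each respondent sees the shared signal plus his own private noise.
    scores = [[s + rng.gauss(0, noise_sd) for s in shared]
              for _ in range(n_respondents)]
    pair_rs = [pearson(a, b) for a, b in combinations(scores, 2)]
    individual_r = sum(pair_rs) / len(pair_rs)
    half = n_respondents // 2
    composite_a = [sum(col) for col in zip(*scores[:half])]
    composite_b = [sum(col) for col in zip(*scores[half:])]
    return individual_r, pearson(composite_a, composite_b)


individual_r, group_r = simulate()
print(round(individual_r, 2), round(group_r, 2))
```

Summing over a half-group lets the private noise largely cancel while the shared signal accumulates, so the two composites should agree far more closely than a typical pair of individuals.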

Further evidence that the inter-agreement of groups is far higher 
than the inter-agreement of individuals is found in Table VI, which is a 
condensation of five different studies. The table can be studied 
effectively by referring to the key on the page preceding it. The first 
column of the table contains the groups correlated. 1-2, for example, 
means the correlations between group one and group two. However, 
group one and group two are different for each of the five studies. 
In the author’s study, column two, group one was composed of
sophomores, spring of 1926; group two, a summer-school class of teachers,
summer of 1926. Group one in Counts’ study, column three, was
composed of one hundred twenty-six seniors in the Bridgeport High
School; group two of seventy-eight seniors in the Meridian High
School. The identification of all groups can easily be found in this
Key to Table VI. All correlations, except those carried out to two
decimal places in Bennion’s study, column four, were calculated by the
author.

A brief description of the studies will help to make clear the 
significance of the different group inter-agreements.

In the author’s study, column two, students and teachers in different
classes in civic education were asked to make a list of twenty-five
greatest Americans, past and present. The correlations show the 
agreements of one class with another with respect to the number of 
times each of sixty-four Americans who were mentioned at least once 
by all four groups, was included in the list. 

In Counts’ study,¹ different groups, as indicated in the key, were
handed a list of forty-five occupations and asked to arrange them
“in the order of their social standing.” The correlations are between
composite rank orders of the occupations, assigned by the different
groups. Rho, not r, is the measure of relationship shown in this
column.

In Bennion’s study,² the groups indicated in the key were asked to
check one hundred thirteen Biblical selections (titles only, given), in
one of six columns, indicating whether each selection would have very
great, more-than-average, average, less-than-average, or little interest
for students of high-school age. The sixth column in that study was
to be checked if the respondent was not sufficiently familiar with the
selection to form a judgment. Only the frequencies of mention in the
column headed “Items of very great value” were used in calculating
the correlations between the different groups.

In Hunter’s study,³ the groups indicated in the key were asked to
check, from a list of fifteen “causes for dismissal” of teachers, those
items which they would consider sufficient causes for dismissal. Only
the votes which were unqualifiedly for dismissal were used in the
calculation of the correlations among the different groups.

In the California Curriculum Study,⁴ the groups indicated in the
key were asked to check, from a list of twenty-seven subjects included
in the elementary school curriculum of the state of California, which
subjects were very important and should be retained in the curriculum

1 Counts, George S.: “The Social Status of Occupations: A Problem in
Vocational Guidance.” School Review, Vol. XXXIII, January, 1925, pp. 16-27.

2 Bennion, Adam S.: “An Objective Determination of Materials for a Course of
Study in Biblical Literature.” Unpublished Ph.D. Thesis, University of
California, 1923, pp. 137-138, 158, and Appendix, p. 8.

3 Hunter, Fred M.: “Teacher Tenure Legislation in the United States.”
Unpublished Ed.D. Thesis in the University of California, 1924, pp. 69-70, 73-82.

4 Bagley, William C. and Kyte, George C.: The California Curriculum Study.
Berkeley, California: The University of California Printing Office, 1926,
pp. 414-419.

Key to Table VI


Author

1. Thirty-one college sophomores.
2. Forty-two teachers in a summer school class.
3. Fifty-eight college sophomores and juniors.
4. Fifty-two college sophomores and juniors.

Counts

1. One hundred twenty-six seniors in the Bridgeport High School.
2. Seventy-eight seniors in the Meridian High School.
3. Sixty seniors in the Milwaukee Trade School for Boys.
4. Forty-two seniors in the Wallingford High School.
5. Sixty-two freshmen in the College of Agriculture, University of Minnesota.
6. Eighty-two Minneapolis school teachers.

Bennion

1. One hundred ministers.
2. One hundred Biblical scholars and teachers.
3. One hundred high school English teachers.
4. One thousand pupils who had studied Biblical literature.

Hunter

1. One thousand twenty-four classroom teachers.
2. Two hundred ninety-five city superintendents.
3. Two hundred ninety-one principals.
4. One hundred forty-nine “intelligent” laymen.
5. One hundred twenty-one presidents of state teachers colleges and normal
   schools.
6. Seventy-two deans of teachers colleges, professors of school administration,
   and other college professors.
7. Fifty-seven presidents of state universities and colleges.
8. Forty-nine county superintendents.
9. Thirty-nine state superintendents of public instruction.
10. Fifteen presidents of state parent-teacher organizations.

Bagley-Kyte

1. Sixty-six lay citizens.
2. Two hundred seventy elementary school principals and teachers.
3. Two hundred sixty high school principals and teachers.
4. Twenty-three school superintendents.
5. Thirty-one county superintendents.
6. Seventy-seven college teachers of education.
TABLE VI.—CORRELATIONS AMONG GROUPS IN FIVE DIFFERENT STUDIES 

                                        Correlations in 

Groups        Author's study   Counts' study     Bennion's study   Hunter's study    California 
correlated    of greatest      of occupational   of Biblical       of teacher        curriculum 
              Americans        preference        interests         dismissal         study 

1-2           .896             .981              .96               .849              .847 
1-3           .938             .979              .93               .951              .929 
1-4           .918             .954              .774              .908              .839 
1-5           ..               [?]               ..                .835              .843 
1-6           ..               [?]               ..                .942              .560 
1-7           ..               ..                ..                .779              .. 
1-8           ..               ..                ..                .901              .. 
1-9           ..               ..                ..                .864              .. 
1-10          ..               ..                ..                .790              .. 
2-3           .908             .967              .92               .936              .902 
2-4           .911             .973              .772              .801              .925 
2-5           ..               [?]               ..                .981              .943 
2-6           ..               [?]               ..                .885              .845 
2-7           ..               ..                ..                .854              .. 
2-8           ..               ..                ..                .931              .. 
2-9           ..               ..                ..                .952              .. 
2-10          ..               ..                ..                .657              .. 
3-4           .893             .950              .791              .909              .916 
3-5           ..               [?]               ..                .901              .913 
3-6           ..               [?]               ..                .906              .745 
3-7           ..               ..                ..                .875              .. 
3-8           ..               ..                ..                .946              .. 
3-9           ..               ..                ..                .892              .. 
3-10          ..               ..                ..                .749              .. 
4-5           ..               [?]               ..                .756              .952 
4-6           ..               [?]               ..                .813              .822 
4-7           ..               ..                ..                .907              .. 
4-8           ..               ..                ..                .858              .. 
4-9           ..               ..                ..                .800              .. 
4-10          ..               ..                ..                .776              .. 
5-6           ..               [?]               ..                .879              .780 
5-7           ..               ..                ..                .826              .. 
5-8           ..               ..                ..                .909              .. 
5-9           ..               ..                ..                .956              .. 
5-10          ..               ..                ..                .663              .. 
6-7           ..               ..                ..                .748              .. 
6-8           ..               ..                ..                .891              .. 
6-9           ..               ..                ..                .876              .. 
6-10          ..               ..                ..                .796              .. 
7-8           ..               ..                ..                .811              .. 
7-9           ..               ..                ..                .514              .. 
7-10          ..               ..                ..                .682              .. 
8-9           ..               ..                ..                .858              .. 
8-10          ..               ..                ..                .732              .. 
9-10          ..               ..                ..                .671              .. 

Range         .89-.94          .91-.98           .77-.96           .51-.98           .56-.95 
Mean          .911             .955              .858              .838              .851 

[?] = entry illegible in the source; .. = no such pair of groups in that study. 
and which were sufficiently unimportant to justify their being dropped. 
The correlations are between the “very important” columns. 

The agreement among groups in these different studies, carried on by 
different investigators at different times and under different conditions, 
is surprisingly high and surprisingly uniform. 
The range of the correlations showing this 
agreement of group with group is, for Table VI, 0.51 to 0.98; the 
average for the table is 0.87. If the coefficient of alienation is used, 
as it was used on page 485, to estimate how good a given correlation is, 
it is found that the coefficient of alienation for a correlation of 0.87 
is 0.49. Subtracting 0.49 from 1.00 shows that the prediction of one 
group’s responses from another group’s known responses would be 
fifty-one per cent better than chance. The responses of the group 
seem to have considerable stability and considerable merit. Even 
though the inter-agreements of the individuals composing these groups 
are low (about 0.30 to 0.35, as shown on page 492), the idiosyncrasies 
of the individuals seem to be ironed out as the responses of more and 
more individuals are merged into a composite. 
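
The arithmetic of the preceding paragraph can be checked with a short sketch. It assumes that the coefficient of alienation takes its usual form, k = sqrt(1 - r^2), and that the improvement over chance is 1 - k; the function names are illustrative only and are not drawn from the study.

```python
import math


def coefficient_of_alienation(r: float) -> float:
    """Return k = sqrt(1 - r^2) for a correlation coefficient r."""
    return math.sqrt(1.0 - r * r)


def improvement_over_chance(r: float) -> float:
    """Return 1 - k, the proportion by which prediction betters chance."""
    return 1.0 - coefficient_of_alienation(r)


# The lowest, average, and highest correlations quoted for Table VI.
for r in (0.51, 0.87, 0.98):
    k = coefficient_of_alienation(r)
    print(f"r = {r:.2f}   k = {k:.2f}   better than chance by {1 - k:.0%}")
```

For r = 0.87 this gives k of about 0.49 and an improvement of about fifty-one per cent, in agreement with the figures in the text.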

It must be noted, however, that even this high group objectivity is 
considerably below the average individual objectivity attained by the 
scorers of the school building reported on page 489. The high objec- 
tivity of the group does not reach the objectivity attained by individual 
respondents when the latter are trained in the understanding and use 
of common standards; the objectivity of groups would probably be 
far below that of individuals who used mechanical aids, where that is 
appropriate and feasible, in the formulation of their opinions and 
judgments. 
THE INTERRELATIONS OF ADULT INTERESTS 


EDWARD L. THORNDIKE 
Institute of Educational Research, Teachers College, Columbia University 


If the ratings of an individual for each item for age twenty to 
twenty-nine, thirty to thirty-nine, forty to forty-nine, and fifty to fifty-nine 
are summed, they provide convenient estimates of his likes and dislikes. 
Ratings independently made nearly two years later by fifty-four of 
the one hundred twenty-two individuals studied show correlations 
with the earlier ones for such sums as follows: 








Item    Correlation        Item    Correlation 

  1        .58               9        .71 
  2        .71              10        .68 
  3        .74              11        .89 
  4        .64              12        .71 
  5        .76              13        .83 
  6        .83              14        .72 
  7        .57              15        .80 
  8        .51              16        .70 

Sum of 1-16    .85 














These summed ratings are then surely fairly indicative of what the 
person thinks his likes and dislikes are, and are probably fairly indica- 
tive of what his likes and dislikes really are. Their interrelations are 
thus of interest. They appear in Table XIII for one hundred sixteen 
college graduates for whom we had record sheets completely filled. 

I present the correlation coefficients uncorrected for attenuation 
and shall argue from these, because the errors of the ratings are cer- 
tainly not entirely uncorrelated, and the straight-forward application 
of the correction formula is consequently unjustifiable. There is a 
tendency for the ratings of the sixteen interests by one individual to 





1 The investigations reported here were made possible by a grant from the 
Carnegie Corporation. 


498 The Journal of Educational Psychology 


be all too high or all too low, shown by the fact that the sum of the 
sixteen correlates only .85 with the sum of the sixteen two years 
later. The ordinary correction formula would in so far forth give too 
high intercorrelations. 
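
A minimal sketch of the correction under discussion follows. It assumes that the "ordinary correction formula" is Spearman's, r_xy / sqrt(r_xx * r_yy), and that the two-year retest correlations tabulated above serve as the reliabilities. The observed coefficient of .25 is used purely for illustration, and the function name is hypothetical.

```python
import math


def correct_for_attenuation(r_xy: float, r_xx: float, r_yy: float) -> float:
    """Spearman's correction: the observed r divided by the geometric
    mean of the two reliabilities (assumed form of the 'ordinary formula')."""
    return r_xy / math.sqrt(r_xx * r_yy)


# Retest correlations for Items 1 and 2 from the table above (.58 and .71);
# the observed intercorrelation of .25 is an illustrative value.
corrected = correct_for_attenuation(0.25, 0.58, 0.71)
print(f"observed .25 -> corrected {corrected:.2f}")
```

Because the errors of the ratings are in part correlated, a value raised in this way is to be read as an upper bound rather than as an estimate.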


TABLE XIII.—INTER-CORRELATIONS OF SIXTEEN INTERESTS AS RATED BY ONE 
HUNDRED SIXTEEN COLLEGE GRADUATES, USING THE AVERAGE OF THE 
RATINGS FOR AGES TWENTY TO TWENTY-NINE, THIRTY TO THIRTY-NINE, 
FORTY TO FORTY-NINE, AND FIFTY TO FIFTY-NINE. DECIMAL 
POINTS ARE OMITTED. STANDARD DEVIATIONS AND MEANS 
ARE SHOWN AT THE FOOT OF THE TABLE 
The matter is not of great consequence for the purposes of this 
article, since our chief interest will be in large relative differences 
between the coefficients, especially after partialling out the influence 
of average enjoyment, differences in interpretation of standards, and 
constant errors. The reader who wishes to think in terms of the 
correlations which would be obtained from perfect measures of what 
the persons think their likes and dislikes are will probably be not far 
wrong if he multiplies the coefficients in Table XIII by 1%, or, better, 













adds to each r in Table XIII, a little more than one half the difference 
between it and the corresponding r in Table XIIIA.¹ 

Inspection of Table XIII shows lower correlations for this mis- 
cellany of interests than a miscellany of abilities would probably show, 
only five of the one hundred twenty being above .40, and twenty-four 


TABLE XIIIA.—THE CORRELATIONS OF TABLE XIII CORRECTED FOR ATTENUATION 
BY THE ORDINARY FORMULA. BECAUSE OF THE CONSTANT ERROR DESCRIBED 
IN THE TEXT, THE ENTRIES ARE SOMEWHAT TOO HIGH. DECIMAL POINTS 
ARE OMITTED 
being negative. Relatively high correlations are found among 1, 2, 7, 
and 10 (.25, .23, .45, .13, .36 and .31), also between 4 and 5 (.32) and 
between 8 and 9 (.40), and among 13, 14, and 15 (.35, .45 and .52). 6 
is opposed to most of the other interests, and has only weak affiliations 
with 1 and 5. 11 is negative or very low with all the interests except 
those suggesting dealings with persons (12, 13, 14, and 15). 12 is 
related to reading the newspaper and to the 13, 14, 15 group. 16 is a 





1 I give in Table XIIIA the coefficients of Table XIII, each raised by the 
application of the ordinary formula, to save anyone who wishes to use them the 
labor of computing them. 
catholic interest, allied to 2, 3, 4, 8, 10, 13, 14 and 15, but especially 
to 15. 

Table XIII shows an antagonism between the interest in reading 
(1 and 2) and the interest in people (13, 14 and 15). The six correla- 
tions are all negative; those for reading fiction are —.38, —.30, and 
—.25. It also suggests an antagonism between interest in one’s reg- 
ular job and interest in the ordinary recreations (reading, athletics, 
games, dancing, music, and even travel) since the positive correlations 
that do occur are low and may be accounted for by a constant error 
of certain individuals to rate all likes too high. 

There may be such a tendency. The sum of the sixteen sums is 
very high for some individuals and very low for others, as shown in 
Table XIV; and the separate likings do all correlate positively with it, 
the r’s for 1, 2, 3, etc., being, in order, .31, .35, .38, .40, .44, .06, .36, 
.47, .50, .33, .24, .37, .39, .52, .66, and .52. All but two of them 
correlate with it more than the .25 which would be due to its being their 
sum (assuming that all have the same weight in the sum). 
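
The figure of .25 can be reproduced on the assumption that the sixteen ratings are mutually uncorrelated and of equal variance, in which case each correlates 1/sqrt(16) with their sum. The sketch below applies that baseline to the sixteen coefficients just listed; the variable names are illustrative.

```python
import math

n_items = 16
# Correlation of one of n uncorrelated, equally weighted parts with their sum.
baseline = 1 / math.sqrt(n_items)

# Correlations of Items 1 to 16 with the sum of sums, as listed in the text.
r_with_sum = [.31, .35, .38, .40, .44, .06, .36, .47,
              .50, .33, .24, .37, .39, .52, .66, .52]

# Items that do not exceed the baseline.
not_above = [item for item, r in enumerate(r_with_sum, start=1) if r <= baseline]
print(f"baseline = {baseline:.2f}; items not above it: {not_above}")
```

The two items that fall short are 6 and 11.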


TABLE XIV.—THE VARIATION IN THE SUM OF THE SIXTEEN SUMS 

Sum of the sixteen | Number of persons | Sum of the sixteen | Number of persons 
credits            | reporting         | credits            | reporting 

20-29              |  1                | 170-179            | 15 
30-39              | ..                | 180-189            |  6 
40-49              |  1                | 190-199            |  5 
50-59              |  2                | 200-209            |  4 
60-69              | ..                | 210-219            |  3 
70-79              |  1                | 220-229            |  3 
80-89              |  3                | 230-239            | .. 
90-99              |  5                | 240-249            | .. 
100-109            |  6                | 250-259            | .. 
110-119            |  4                | 260-269            |  1 
120-129            | 11                |                    | 
130-139            | 15                |                    | 
140-149            | 14                |                    | 
150-159            |  6                |                    | 
160-169            | 10                |                    | 














The sum of the sixteen sums is indicative of the degree to which a 
person thinks he enjoys the sixteen items on the average. It is 
determined by his average enjoyment, his standards for the values 
from +5 to +1 and from —5 to —1, and any constant errors which he 
makes in estimating his likes and dislikes in accordance with his 
standards. If the standards were alike for all and if there were no 
constant errors, the sum of the sums would measure the individual 
differences in average enjoyment. We shall later consider how much 
of its variation is due to realities of likes and dislikes and how much 
to differences in standards and constant errors. 

We may eliminate all influence of average enjoyment, standards, 
optimism and other constant errors by “‘ partialling out” the sum of the 
sums. The correlations of Table XIII then become those of Table XV, 
representing what would be found in a group whose members were 
alike in the sum of sums, but in all other respects like our group. 
They represent the interrelations of the reported likes and dislikes 
freed from all influence of differences in general enjoyment, standards, 
and constant errors in all sixteen ratings. It appears at once that 
the correlations of Table XV are small (four-fifths of the one hundred 
twenty are between —.20 and +.15) and that the great majority of 
them (sixty-eight percent) are negative. 
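
The partialling out described above may be sketched with the usual first-order partial correlation formula, on the assumption that this is the formula that was applied. The inputs are coefficients quoted in the text: the intercorrelations of .25 (taken here to be that of Items 1 and 2) and .35 (taken to be that of Items 13 and 14), together with each item's correlation with the sum of sums. The pairing of coefficients with items follows the order of listing and is an assumption.

```python
import math


def partial_r(r12: float, r1s: float, r2s: float) -> float:
    """Correlation of variables 1 and 2 with variable s held constant."""
    return (r12 - r1s * r2s) / math.sqrt((1 - r1s ** 2) * (1 - r2s ** 2))


# Items 1 and 2: r = .25; their correlations with the sum are .31 and .35.
print(round(partial_r(0.25, 0.31, 0.35), 2))   # 0.16

# Items 13 and 14: r = .35; their correlations with the sum are .39 and .52.
print(round(partial_r(0.35, 0.39, 0.52), 2))   # 0.19
```

Both results agree with the values the text reports for Table XV.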

The linkages suggested by Table XIII are corroborated by Table 
XV. 1, 2, 7, and 10 show correlations of .16, .13, .38, .01, .28 and .22, 
and account for three of the eight closest correlations. 13, 14, and 15 
show correlations of .19, .28, and .28 and account for two more of the 
eight closest correlations. The r’s of 4 with 5 and 8 with 9 are .17 and 
.22. The other correlations above .20 are 2 with 3 and 15 with 16. 
1, 7, and 10 are strongly antagonistic to 13, 14 and 15. Interest in 
one’s regular job is antagonistic to interest in recreation, especially 
sedentary games, dancing, theatre and movies. There is apparently 
a complex set of links between the interest in reading newspapers and 
those in reading non-fiction, the theatre or movies, and politics. There 
is another set of linkages between the interest in sedentary games and 
the interests in reading fiction, and (much less emphatically) in non- 
competitive outdoor sports and in the theatre. 

I have not had time to subject the facts of Table XIII or of Table 
XV to more searching analysis by the methods of Kelley, Thurstone, 
or Hotelling. The obvious presence of many important factors or 
components specific to single interests, and the group factors for read- 
ing (other than newspapers), outdoor sports and outdoor competitive 
games, making music and listening to music, and interest in people 
would, I judge, make such an analysis very laborious. How much it 
would add in the way of suggestions concerning the nature of the 
factors I am unable to say. 

Let us turn now to the sum of the sixteen sums and try to discover 
how far it measures real differences among the one hundred sixteen 
persons in the way of greater average enjoyment by some than by 
others, and how far it may measure only differences in their standards 
and errors in their judgments. First of all, I have considered the 
highest tenth and the lowest tenth in this score. Five of the high 


TABLE XV.—THE INTERCORRELATIONS OF TABLE XIII, FREED FROM THE 
INFLUENCE OF THE AVERAGE LIKING FOR ALL SIXTEEN ACTIVITIES 

twelve and six of the low twelve are personally well enough known to 
me so that I should feel competent to rate them for average enjoyment 
of these sixteen activities. In ten of these eleven cases I should 
unhesitatingly have rated the high as above average in average enjoy- 
ment of these sixteen items and the low as low. The highest of all 
(reporting plus values for all sixteen with an average of 4.2) is in my 
opinion a substantially true record. The lowest of all (reporting 0.4 
as an average) is also, in my opinion, little if any below the truth. 
Practically all his enjoyment has been obtained from a special form of 
activity not here listed. 
Next I have compared the twenty-three psychologists, who should 
be less subject to inaccurate standards and constant errors of 
self-observation than the teachers, lawyers, physicians and business men, 
with those others. The psychologists show almost the same median and 
variation in the sum of the sixteen sums as the others (147½ with an 
average deviation of 32 for psychologists, 146 with an average deviation 
of 31 for the others). 

In the third place, we must bear in mind that the standard of 0 is 
subject to almost no misconception, and that the standard of +5 is 
fairly well defined by certain experiences of sex, triumph, exaltation, 
some of which had almost certainly been experienced by practically 
all of the one hundred sixteen persons. It is in the standard for —5 
that great diversities might be possible because some of the persons 
may never have had excruciating bodily pain or mental anguish. But this 
standard has relatively little influence because so few of the activities 
are positively disliked. 

As to constant errors of moving estimates of likes unduly far from 0 
toward +5 and estimates of dislikes unduly far from 0 toward —5, 
the former probably does exist. Persons who are less able and who 
assign the ratings with less care give an unduly large number of ratings 
of +5. But the magnitude of this error in our group is probably not 
large. As we have seen, it is not large enough to cause a difference 
between the psychologists, who would guard against it, and the others. 

As to a tendency toward optimism which would push estimates of 
likes unduly toward +5 and would push estimates of dislikes unduly 
toward 0, in some persons, and, conversely toward pessimism in other 
persons, it probably does exist to some extent. Thus the psychologists 
seem less optimistic than the others since their downward variation is 
larger (.70 to .60) and their upward variation smaller (.58 to .64) but 
not with great reliability. Such differences in optimism versus pessi- 
mism are likely to be highly correlated with differences in the number 
of things liked and the amount of liking. 

So on the whole it seems probable that differences in the sum of the 
sixteen sums represent, at least in part, real differences in the number 
of things liked and the amount of liking, and in general optimism 
versus pessimism. The factor partialled out in Table XV is then 
probably in part a genuine difference in vigor and breadth of interests, 
and partly a difference in optimism which may well be closely corre- 
lated with the former. 
We may then assume that the general features of the organization 
of Items 1-16 will correspond fairly well with the general features of 
the organization of interests in these persons as determined by an 
omniscient observer and, until more or better data are available, may 
conclude that the following are important factors in the constitution 
of the interests of these adults and probably of people in general. 

I. Average or total tendency to enjoy. I shall call this Gen. Like 
though it is not general in the strict sense. It seems opposed to the 
interest in sedentary games and one’s regular job. Differences in it 
contribute something but not much to the individual differences in 
Items 1, 2, 3, 4, 5, 7, 8, 9, 10, 12, 13, 14, 15, and 16. 

II. Liking for social intercourse including talking. The inter- 
correlations of 12, 13, 14, 15, and 16 in Table XV average .12, much 
above the median of the 120. 

III. Liking for utility. 2, 3, 11, 12 and 13 in Table XV average 
.04. 

IV. Liking for the world of ideas and fancy. 1, 2 and 10 in Table 
XV average .27. 1 and 10 correlate .38. 

V. Liking for music. 8 and 9 correlate .22. 

VI. Liking for outdoor sport. 4 and 5 correlate .17. 
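
The averages quoted in this list are means of the partial correlations among the members of each cluster. A minimal sketch for factor IV follows. The three coefficients are those reported in the text for Items 1, 2, 7 and 10 in Table XV, on the assumption that they are listed in the order 1-2, 1-7, 1-10, 2-7, 2-10, 7-10; the helper name is hypothetical.

```python
from itertools import combinations
from statistics import mean

# Partial correlations from Table XV as quoted in the text (assumed pairing).
partial = {(1, 2): .16, (1, 10): .38, (2, 10): .28}


def cluster_average(items, table):
    """Mean of the correlations over every pair of items in the cluster."""
    return mean(table[pair] for pair in combinations(sorted(items), 2))


print(round(cluster_average([1, 2, 10], partial), 2))   # 0.27
```

The result agrees with the average of .27 given for factor IV, which supports the assumed order of listing.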

It is worth noting that the likings for activities commonly 
considered as self-indulgences do not show much community after the 
partialling out of the Sum score. 1, 7, 8, 9, 10 and 16 do intercorrelate 
by an average of .04, but if 1, 5 and 6 are introduced, the average 
becomes -.03. This is still a bit above the median of Table XV 
(-.05½). We may posit love of pleasure or self-indulgence as a 
VII, but the persons in this group do not differ much in it, save via 
Gen. Like. 


ADDITIONAL DATA 


As a check upon all the results so far I secured similar ratings from 
three other groups of educated adults numbering sixty, twenty-five 
and twenty, and including both men and women. 

I report in Tables XVI and XVII the intercorrelations and the 
partial intercorrelations after the influence of individual differences 
in tendency to enjoy, in optimism, and in the use or misuse of the 
scale has been removed by partialling out the r’s with the sum of the 
sixteen ratings. Table XVII is not derived from Table XVI, but 
each is derived from three separate tables for these groups of sixty, 
twenty-five and twenty for each of which the partial r’s were obtained 
separately. Combination was made by averaging, giving the sixty 
group a weight of two and each of the others a weight of one.¹ Since 
none of the r’s were high it did not seem wise to spend the time to make 
Zeta transformations. 
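
The choice just described can be illustrated by comparing the direct weighted average of three coefficients with the average taken after a transformation. The sketch assumes that "Zeta transformations" refers to Fisher's z = arctanh(r); the three r's are hypothetical values chosen only to be small, as those of the tables are.

```python
import math


def weighted_mean_r(rs, weights):
    """Direct weighted average of correlation coefficients."""
    return sum(w * r for r, w in zip(rs, weights)) / sum(weights)


def weighted_mean_r_via_z(rs, weights):
    """Average on Fisher's z scale, then transform back to r."""
    z_bar = sum(w * math.atanh(r) for r, w in zip(rs, weights)) / sum(weights)
    return math.tanh(z_bar)


weights = [2, 1, 1]         # groups of sixty, twenty-five and twenty
rs = [0.20, 0.10, 0.30]     # hypothetical coefficients for one pair of items

direct = weighted_mean_r(rs, weights)
via_z = weighted_mean_r_via_z(rs, weights)
print(f"direct = {direct:.3f}   via z = {via_z:.3f}")
```

With coefficients of this size the two methods differ only in the third decimal place, which is consistent with the decision not to transform.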


TaBLE XVI.—INTERCORRELATIONS OF SIXTEEN INTERESTS AS RATED BY ONB 
HuNDRED Five EpvucatTep ADULTS 











1 2 3 4 5 6 7 8 if) 10 | 11 | 12 | 13 | 14] 15} 16 
2 zg 
a] & o >. a=) = 
a A ‘ E - 3 2 2 © 8 | 
= @ideol¢é a = gin “4 les el-g |Sum 
Gia] aloe ) 3 | ° - » |S lB 
° A =| 9° > Me 2 o |- @ 4|0 
218 5 ne 2 ee ee - |. Falg 
Sle) sisaiss sie) Fial |e) 2] § | 28) Fsloe 
3 3 3 Zeiss es = a s 4 4 = = |\Rolike >-s 
©] e@|ssisg 3 @¢)/S i 2lel2isl®& (stl sgize 
meiminal6°l6 AZ ISAIBR SZ a |e las Stic2 
1 38 | 15 | 21 |—06) 16;—O1] 07! 14 48; 06) O09) 11} 47) 26) 33] 49 1 
2 .. | 28 | O07 |—12| 10|—14| 15) 08 22; 30; 12; 14) #27; 29) O2| 38 2 
3 i 13 |—06| 13|;—06} 19) 13 26; O02) 20) 34) 27| 27|—00] 42 3 
4 a 34|—09; 33|—04/) 21 14;—18; 04) O11} 23) 13) 18) 38 4 
5 ..|—03} 14|;—07| 10 |—21| 10) 11|—07|—01|—03)—09) 27 5 
6 Te 03; —06) 10 20; 04) 16) 11) 25) 03|—10) 30 6 
7 25| 09 15; —23|—09; 24) 21) 18) 31) 39 7 
8 .-| 41 13|}—07; O11; 18) 24) 18) 16) 40 8 
Se «« i a ve: Bedadis vahve edie sees 10; 13) O8} 18) 29) O9| 12) 46 9 
10] .. - 7 ve Bess cbbe@abtiovebneel «a deisel 2h. a. of 6a Ue 
1l ~ si ba + Benes sdebesdaiedsah wi devesiesent 2: a. Ge. 2 Sees 
ee we ae oy Te er eee ee See ae ee on ae a 66 le Ur UL 
So a % i ie Besw sede Sakessahewesk wa Beveenks ssdbotcacinasal 2 2 ae oe 
14 a a = we ee hae ee! ee, eo: ee ae [o 41) 44) 67 | 14 



























































The particular correlations from these additional records vary 
considerably from those from the one hundred sixteen men, but the 
evidence for the existence of III, IV, V and VI is emphatic. The evidence 
for II is weaker. The average of the partial correlations for 12, 13, 14, 
15 and 16 is -.03, which is only a little above the median of Table 
XVII (-.05½). The evidence is rather against VII. 1, 5, 6, 7, 8, 
9, 10 and 16 are linked only by Gen. Like, optimism, and possible 




1 The smaller groups were given more weight because they were more homo- 
geneous and younger, and so presumably less subject to errors of memory. The 
measure of liking for the group of sixty was the liking at twenty to twenty-nine 
plus that at either forty to forty-nine, or fifty to fifty-nine, or sixty to sixty-nine. 
The measure for this group of twenty-five was the liking at fifteen to nineteen plus 
that at twenty to twenty-nine. The measure for the group of twenty was the 
liking at twenty to twenty-nine plus that at thirty to thirty-nine. 
constant errors. The correlations of the separate likings with the sum 
of all are well above the .25 to be expected from their shares in con- 
stituting it, and it is reasonable to suppose that Gen. Like accounts 
for some of the variance in most of them. 

As these persons were not well known to me, I could not test this; 
but strong evidence for the existence and influence of Gen. Like appears 
in other data collected by Dr. Lorge and by myself. 


TABLE XVII.—THE INTERCORRELATIONS OF TABLE XVI AFTER PARTIALLING OUT 
THE SUM OF THE RATINGS 
All these records go to show two general facts. First, there is 
great specialization of interests. Second, such group factors as appear 
seem more related to characteristics of the situations responded to 
than to unitary ‘‘traits” in the persons. Music, sport, friendly inter- 
course and talk, fiction and drama are certainly more obvious, and 
are probably more significant, as organizing causes than conscientious- 
ness, pugnacity, love of achievement, curiosity, craving of bodily 
exercise, and the like. 

The nature of the sixteen items rated might conceivably account for 
these two facts, at least in part, but I find evidence to the same effect 
in an investigation in which five hundred eighty-seven items, and also 
all those of the Strong Vocational Interest questionnaire, were rated. 
Doubtless there are differences in the genes of men which cause some 
men to like music, literature, the theatre, art, animals, children, 
gardening, neatness, religion, or fighting more than other men do. But 
it is not easy to discover what they are. 
THE VOCATIONAL INTERESTS AND PERSONALITY 
TEST SCORES OF A PAIR OF DICE 


PAUL S. BURNHAM AND ALBERT B. CRAWFORD 
Yale University 


As a matter of theoretical interest in respect to the possible chance 
determination of scores made on Strong’s Vocational Interest Blank, 
Bernreuter’s Personality Inventory,² and Thurstone’s Personality 
Schedule,³ it was decided to investigate what scores would be made 
if the response to each item in the tests were to be determined by 
dice. Under these conditions the answer given to any question would 
not express the reaction of any individual, nor would there be anything 
but a chance relationship between the answers to any two questions. 
The results are here reported, in no spirit of captious or destructive 
criticism. On the contrary, enlightenment is sincerely sought on the 
true significance of scores such as these, when made by individuals. 

It may be recalled that to each of the one hundred twenty-five 
items on the Personality Inventory and the two hundred twenty-three 
items on the Personality Schedule the subject makes a choice from 
one of three alternative responses, Yes, No and ?. In filling out the 
blanks, if the die turned up a one or a four this was arbitrarily taken 
to indicate the answer Yes; a two or a five, No; a three or a six, ?. 
Each of the four hundred twenty items on the Vocational Interest 
Blank also offers the subject three alternatives and the same procedure 
was used to determine the responses. In this way ten Personality 
Inventories, ten Personality Schedules, and ten Vocational Interest 
Blanks were filled out. 
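
The procedure for filling out a blank can be sketched as a simulation. The mapping of die faces to answers is the one stated above; the pseudo-random generator stands in for the physical die, and the seed, the function name, and the constant name are illustrative assumptions.

```python
import random
from collections import Counter

# Mapping stated in the text: 1 or 4 -> Yes, 2 or 5 -> No, 3 or 6 -> ?
FACE_TO_ANSWER = {1: "Yes", 4: "Yes", 2: "No", 5: "No", 3: "?", 6: "?"}


def fill_blank(n_items: int, rng: random.Random) -> list:
    """Answer every item of a blank by one simulated throw of a die."""
    return [FACE_TO_ANSWER[rng.randint(1, 6)] for _ in range(n_items)]


rng = random.Random(1935)            # arbitrary seed, for repeatability only
inventory = fill_blank(125, rng)     # the Personality Inventory has 125 items
counts = Counter(inventory)
print(counts)
```

Each answer is expected in about one third of the items, and the answer to one item is independent of the answer to any other.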

To investigate experimentally the chance incidence of dice throws, 
data had previously been secured on the distribution of die faces on 
thirty-two hundred twenty-four successive throws. The following 
table shows the extent to which the probability of securing one of the 
following combinations approximated ⅓, to which figure it should 
correspond, theoretically at least: 


DIE FACE        PERCENTAGE INCIDENCE IN THIRTY-TWO HUNDRED TWENTY-FOUR CASTS 

1 or 4           32.6 
2 or 5           32.1 
3 or 6           35.3 
                100.0 


N.B.: The probable error of each of the above percentages is approximately 
±0.7. 
The scores actually secured on the Personality Inventory and 
Personality Schedule are profiled on the accompanying charts. The 
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would be classified in nine cases as Emotionally Maladjusted and in 
one case as In Need of Psychiatric Advice. 

The forty Bernreuter Personality Inventory dice-determined 
scores are profiled on Chart II. The interesting feature about this 
profile is that most of the scores are grouped pretty well together and 
(on a scale ranging upwards from the most to the least “desirable” 
personality traits) rank well above the fiftieth percentile in respect to 
the criterion group of three hundred seventy-six college men. Only 
one score is significantly below the fiftieth percentile and that has a 
percentile score of thirty-one on the Dominance-Submission Scale. 
If this were read the other way about (i.e., more Dominant than 
sixty-nine per cent) it would be classed as a deviate of importance. 
It is quite evident that these dice-determined scores do not—as one 
might suppose they would—group about the fiftieth percentile. 
Rather do they quite markedly group between the fiftieth and ninetieth 
percentiles, with a notable concentration between the seventy-third 
and ninety-first percentiles on the Dominance-Submission scale. 
In other words, the mean thus determined by chance lies distinctly 
in that part of the scale which expresses less than normal effectiveness in 
social adjustment. In this respect the chance-obtained data on this 
test agree with those secured on the Thurstone Personality Schedule. 
Since the latter was employed as a criterion in developing the Bern- 
reuter Scale, this result is what might be expected if consistency char- 
acterized the chance-determination of replies to both tests. Such 
findings should not be interpreted as attacking the validity of the 
tests however, if it is possible that emotional instability, or social 
ineffectiveness, may be properly represented by a random, rather 
than a controlled and positive, series of reactions to questions of the 
type involved in these personality inventories; and might therefore 
properly agree with answers determined by chance rather than by 
choice. 

When each of the ten Vocational Interest Blanks had been scored 
for twenty-seven occupations, but one A score (representing most 
positive correspondence of interests with those of some occupational 
group) was found. The majority of the scores were C (representing 
no correspondence of interests with those of the occupation in ques- 
tion). Table III shows the distribution of the A and B scores (repre- 
senting, respectively, probable and possible correspondence with the 
interest pattern typical of successful men in these occupations). 


SCORES ON TEN VOCATIONAL INTEREST BLANKS 
The B and B- scores on the Lawyer, Physician, Vacuum Cleaner 
Salesman, Purchasing Agent and Real Estate Salesman scales might be 
looked upon as chance deviates. However, the same certainly cannot 
be said in respect to the Boy Scout Master and Journalist scales. 
Here the consistency with which B scores (in addition to the one 
A score) were secured from dice throws would indicate that some 
element associated with a random response factor may have been 
operating in the original determination of these scales. 

From these data it may be concluded that it is perfectly possible to 
secure by chance, scores on these tests of a nature which, if made by 
human subjects, might be regarded as significant and which, in 
present practice, are frequently so interpreted. Some of the questions 
which these findings raise are: 

1. What cautions must be observed in the interpretation of scores 
which fall within the ranges of those secured by dice-throws? 

2. Is it possible that the personality scales used are by their natures 
heavily weighted in such a way that chance responses will probably 
produce deviate scores of presumably significant nature? 

3. Is a neurotic, emotionally unstable, introverted, dependent or 
submissive person the type that the psychiatrist or psychoanalyst 
would expect to answer such a test haphazardly, thus reacting in a 
manner similar to that of chance-determined responses? 
GROUP FACTORS IN COLLEGE CURRICULA 


CHARLES I. MOSIER 


University of Florida 


Although the past few years have seen a great deal of research into 
the mental abilities which can be differentiated in test performances, 
little work has been done in an attempt to verify these abilities in the 
actual situations of adaptation to exclude the possibility that those 
abilities are artifacts or the products of the test situation. This study 
is designed to make a preliminary investigation in this field. It is 
hoped that the results will form a basis for more thorough investigation. 

The problem as formulated involves the determination of which of 
the abilities measured by the American Council on Education Psycho- 
logical Examination are required by each curriculum offered at the 
University of Florida. The formulation of the problem arises out of
four hypotheses and the deductions from them. These hypotheses are
made explicit below:


1. Independent Abilities.—That each of the five parts of the A. C. E.
Psychological Examination measures, in addition to a general factor measured
by all parts of the examination, a specific or group ability which is independ- 
ent of this general factor. Thus, for example, the Completion and Opposites 
tests might be considered, from an a priori analysis, as measuring in addition 
to the general factor, a special verbal facility. This hypothesis follows the 
position taken by Thurstone in his Theory of Multiple Factors.¹ In connection
with this problem, there is no claim in the hypothesis as to the nature of any 
of these abilities, except that G, or general ability, be defined as that factor, or 
group of factors, common to all five tests, and that the specific factors, or 
group factors, be independent of G, and measured differentially by the parts 
of the Psychological Examination. 

2. Differential Curriculum Requirements.—That certain of the curricula 
offered at the University of Florida demand these abilities differentially, 
certain of them being essential to success in a given curriculum, others being 
non-essential to success. Thus Journalism might, for example, be considered 
as requiring a high degree of verbal facility, if that be one of the independent 
abilities, and as being unaffected by space and number abilities. It is to test 
the adequacy of this hypothesis that the present study is designed. 

3. Selective Effect of Graduation.—That graduation from any curriculum
operates as a selective agent for those abilities requisite to success in that 
curriculum, eliminating those deficient in the ability or abilities which are 
essential, and exerting no such selective force on those abilities which are not 
essential. Men graduating from Journalism we would expect to be selected 
with reference to verbal facility, and unselected with reference to space and
number abilities. (It must be remembered that the names attached to these 
abilities are simply convenient labels, and that there is no claim for these 
particular abilities, but only for several independent abilities, of any nature 
whatsoever, measured differentially by the several parts of the Psychological 
Examination. It is to be hoped that the results of this study will aid in 
clarifying the nature of the abilities measured.)

4. Relationship between Grades and Abilities.—That the grades received
by a student in a given curriculum are influenced by his abilities, and that 
those grades are more influenced by the essential than by the non-essential 
abilities. While it is true that college grades depend on a number of factors 
other than ability, and that this will operate to attenuate the results, there is 
no reason to believe that this attenuation will operate more in respect to one 
ability than to another. A second limitation upon the operation of this 
hypothesis lies in the fact that while we speak of isolated abilities, we must 
investigate in terms of test scores, and test score depends not only upon the 
specific ability, but also upon the G defined as whatever is common to all of 
the tests. From this we would expect a certain degree of relationship between 
a test score and honor point average, even though the special ability meas- 
ured by that score were not relevant to the particular curriculum under 
consideration. 


Deductions from these hypotheses indicate the method of investi- 
gation. If the first three hypotheses are valid, then it follows that the 
average score of men graduating from a given curriculum will be high- 
est, and that differences between graduates and freshmen will be great- 
est, in the tests which measure the abilities requisite to that curriculum, 
and lower in the tests which measure abilities not demanded by that 
curriculum. The hypotheses relating to independent abilities, cur- 
riculum requirements and the relationship between grades and abilities 
lead to the conclusion that in a given curriculum the coefficients of 
correlation between test score and honor point average will be higher 
for those abilities which are requisite than for those which are not. 
The experimental verification of these deductions will confirm the 
adequacy, though not the necessity, of the hypotheses.

The details of the method of investigation are as follows: The 
subjects were drawn from two sources: Graduates of the University of 
Florida who received their degrees during the years 1932-1934, and 
the freshman classes entering in the fall of 1932 and of 1933. Psycho- 
logical Examination scores on the Completion, Opposites, Artificial 
Language, Analogies, and Arithmetic tests and Total Score on the 
American Council on Education Psychological Examination were 
recorded for each individual. These scores were all measured in terms
of standard scores based on University of Florida Freshmen performance
on each test. The subjects were classified into the following
curriculum groups:


Bachelor of Arts
Bachelor of Arts in Education
Bachelor of Science
Journalism
Agriculture
Engineering
Bachelor of Science in Education
Bachelor of Laws
Business Administration
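
The conversion to standard scores described above may be sketched as follows. This is a minimal illustration only: the raw scores are hypothetical, the function name is our own, and the use of the population standard deviation of the freshman group as the unit is an assumption, since the text does not say which formula was employed.

```python
from statistics import mean, pstdev


def standard_scores(raw_scores, reference_scores):
    """Express raw scores as deviations from the reference-group mean,
    in units of the reference-group standard deviation."""
    m = mean(reference_scores)
    sd = pstdev(reference_scores)  # assumption: population SD of the reference group
    if sd == 0:
        raise ValueError("reference scores have no spread")
    return [(x - m) / sd for x in raw_scores]


# Hypothetical raw scores on one sub-test, for illustration only.
freshman_raw = [20, 30, 40, 50, 60]
graduate_raw = [40, 55, 70]

# Graduates are scaled against the freshman distribution, not their own.
graduate_z = standard_scores(graduate_raw, freshman_raw)
```

Scaling every group against the same freshman distribution is what allows medians from different curricula and different sub-tests to be set side by side.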


The statistical treatment of the data involved the preparation of 
two frequency distributions in each curriculum for each part of the 
Psychological Examination, one for graduates and one for freshmen. 
The median and the probable error of the median were calculated for 
each distribution; the difference between the medians of graduates 
and of freshmen on each test, the probable error and critical ratio for 
each difference were obtained. For freshmen and for graduates the 
tetrachoric correlation between each part of the test and honor point 
average was obtained by the graphical method.² In the light of the
analysis of the problem we are concerned primarily with the median 
score of the graduates from a single curriculum for each part of the 
test, the difference between the median scores of graduates and fresh- 
men, and the coefficients of correlation between each part of the test 
and honor point average. It was found, however, that the correlation 
coefficients between Psychological Examination scores and honor 
point average for the graduates were useless because of the attenuating 
effect of the homogeneity of the population. 
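
The computations named in this paragraph may be sketched as below. The sketch rests on several assumptions not stated in the text: the probable error of a median is taken from the usual normal-theory approximation (0.6745 x 1.2533 x SD / sqrt N), the probable error of a difference is the root of the sum of squares for independent groups, and the tetrachoric coefficient is replaced by the cosine-pi approximation rather than the computing diagrams actually used. All function names and data are hypothetical.

```python
import math
from statistics import median, pstdev

PE_FACTOR = 0.6745      # probable error = 0.6745 x standard error (normal theory)
MEDIAN_FACTOR = 1.2533  # SE of a median = 1.2533 x SD / sqrt(N) (normal theory)


def probable_error_of_median(scores):
    """Normal-theory probable error of a sample median (an assumed formula)."""
    return PE_FACTOR * MEDIAN_FACTOR * pstdev(scores) / math.sqrt(len(scores))


def critical_ratio(graduates, freshmen):
    """Return (difference of medians, PE of the difference, critical ratio),
    treating the two groups as independent samples."""
    diff = median(graduates) - median(freshmen)
    pe_diff = math.hypot(probable_error_of_median(graduates),
                         probable_error_of_median(freshmen))
    return diff, pe_diff, diff / pe_diff


def tetrachoric_cos_pi(a, b, c, d):
    """Cosine-pi approximation to the tetrachoric r for a fourfold table.

    a = high test score and high grades, d = low and low (concordant cells);
    b and c are the two discordant cells."""
    if a * d == 0 and b * c == 0:
        raise ValueError("fourfold table is degenerate")
    if b * c == 0:
        return 1.0
    return math.cos(math.pi / (1.0 + math.sqrt((a * d) / (b * c))))


# Hypothetical standard scores for one curriculum on one sub-test.
graduates = [0.2, 0.5, 0.8, 1.1, 1.4]
freshmen = [-0.6, -0.3, 0.0, 0.3, 0.6]
diff, pe_diff, cr = critical_ratio(graduates, freshmen)

# Hypothetical fourfold table: test score split against honor point average.
r_approx = tetrachoric_cos_pi(30, 10, 10, 30)
```

The cosine-pi formula is only a stand-in; it agrees with the tetrachoric coefficient most closely when both variables are divided near their medians.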

Of the three criteria of significance, no one of them can be relied 
upon exclusively. The correlation coefficient between honor point 
average and test score may be spuriously high because of a disproportionate
amount of extraneous courses carried in the freshman year—
mathematics and science in a Journalism curriculum, or English and 
History in the B.S. curriculum. The difference between the medians 
may be spuriously low because of some selective factor which operates 
to exclude from the freshman group those students whose abilities are
not those required for success in that curriculum. Examples of this 
effect in operation may be seen by referring to Table I, Completion 
and Opposites results in the Journalism curriculum, and Arithmetic
and Analogies results for the Engineering curriculum. The absolute 
height of the graduate median in turn is affected by the difficulty of 
the curriculum in question compared with the other curricula offered 
(cf. the results for the Agriculture group where the average of the gradu- 
ates is lower than the average of all University freshmen). In spite 
of the fact that no single one is conclusive, we may consider that two 
or more in combination constitute evidence of the significance of a 
particular test for the curriculum under consideration. 
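
The rule just stated, that two or more criteria in combination are taken as evidence, may be put schematically as follows. The function is hypothetical and takes the three judgments as already made; the text gives no numerical cut-offs for what counts as "high," and none are assumed here.

```python
def judged_significant(graduate_median_high, difference_significant,
                       correlation_high):
    """Apply the two-or-more-criteria rule to three yes/no judgments."""
    criteria = (graduate_median_high, difference_significant, correlation_high)
    return sum(bool(c) for c in criteria) >= 2


# Pattern reported for Completion in Journalism: high graduate median and
# high correlation, but no significant graduate-freshman difference.
journalism_completion = judged_significant(True, False, True)

# Pattern reported for Arithmetic in most liberal arts curricula:
# a high freshman correlation only.
liberal_arts_arithmetic = judged_significant(False, False, True)
```

Under this rule the Journalism pattern still counts as significant, while a high correlation standing alone does not.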

The results are summarized in Table I. The order in which the 
curricula are presented, and the arrangement of the tests have been 
followed with the purpose of emphasizing the results found. In
explanation of the form of the table, it will be noted that the entries 
are tabled for each curriculum separately. The first column lists the 
curriculum, the second the measure involved, graduate median 
(Gr. Md.), freshman median (Fr. Md.), the difference between the 
two (Diff.), the ratio of difference to probable error of the difference 
(C. R.), and the correlation between test score and honor point average 
for the freshman group, designated simply as correlation; then follows 
a column for each part of the Psychological Examination, Completion, 
Opposites, Artificial Language, Total Score, Analogies and Arithmetic. 
It will be remembered that medians are measured in standard deviation 
units from the mean of the freshman class of the University of Florida. 
Those graduate medians, differences, and correlations which are judged 
to be significantly high, are underscored. 

Study of the results in Table I more than justifies the hypotheses 
on which the investigation is based. The presence of abilities other 
than the common ability measured by all of the sub-tests and the differ- 
ential requirements of the several curricula are established by the 
variation in the tests from curriculum to curriculum. Moreover, 
the consistency of the results indicates that we are dealing with 
abilities that are more than the fluctuations of chance, or the products 
of the artificial situation. Examination of the results will show that 
the curricula may be divided on the basis of the abilities required for 
success into two groups. The first of these, made up of Bachelor of 
Arts in Education, Journalism, Law, Business Administration and 
Bachelor of Arts curricula, is consistent in showing results for Com- 
pletion, Opposites, and Total Score, that are significant when judged 
on at least two and usually all three of the criteria investigated. The 
group is equally self-consistent in denying significance to the Analogies 
test, and, with certain exceptions to be noted later, to the Arithmetic
test. A glance at the curricula included in this group, together with 
the tests which show significance, makes possible a tentative identifi- 
cation of the curricula as dealing with the liberal arts and the social 
sciences, and the requisite special abilities as essentially verbal in 
nature. 

The second major group, composed of the Bachelor of Science, 
Engineering, Agriculture, and Bachelor of Science in Education 
curricula, is equally consistent in affirming the importance of Analogies 
and Arithmetic, and Total Score. This group is identifiable as being 
primarily concerned with the natural sciences. The abilities measured 
by the Arithmetic and Analogies tests are not, however, so readily 
identifiable as were those measured by Completion and Opposites. 
It should be noticed that for every curriculum studied, Total Score 
was of primary significance, though it was not, in every case, the most 
significant single test. In both natural sciences and liberal arts 
curricula there are certain exceptions to these general conclusions, 
and these exceptions are considered further below. 

The first notable exception occurs in the Journalism curriculum, 
where Completion and Opposites, though having graduate medians 
higher than the median for Total Score, and showing high correlations
with freshman honor point average, fail to show any significant differ- 
ences between graduate and freshman medians. This seeming dis- 
crepancy is readily understood by an examination of the freshman 
median score in these two tests. The median scores of freshmen in the
Journalism curriculum are higher for Completion and Opposites than 
are the corresponding medians of graduates from any other curriculum 
except A. B. This result points to the existence of a strong selective 
force operating at entrance to college, to exclude from Journalism those 
who have not the requisite ability. It is the existence of such a 
selective factor which makes it imperative that we consider, not only 
the difference between graduates and freshmen, but the magnitude of 
the graduate median in a particular test, relative to the other graduate 
medians for the same curriculum. 

A second anomalous result in the liberal arts curricula is the impor- 
tance of the Arithmetic test as evidenced by the correlation with fresh- 
man honor point average. This importance is not substantiated by a 
consideration of the other two criteria of significance except in the 
single case of the Law curriculum. The high freshman correlations 
with freshman honor point average can be explained by noting that in 
most of these liberal arts curricula, all of the mathematical studies
carried during the four years of college are concentrated in the freshman 
year, and in all of them there is a greater ratio of hours of mathematics 
to total hours carried in the freshman year than in any other. This 
explanation fails, however, to account for the correlation in the case 
of the freshman Law students, for their course of study contains no 
course remotely resembling mathematics in its outward appearance. 
Moreover, in this curriculum, the significance of Arithmetic is indicated, 
not only by the correlation, but by the graduate-freshman difference 
as well. The importance of the Arithmetic test here would indicate 
that the test measures, in addition to a number ability (assuming that 
it does measure that), a reasoning or problem solving ability which is 
advantageous in the sort of work required of Law students. 

Within the natural sciences curricula the outstanding anomalous 
result is to be found in the Agriculture curriculum. There it will be 
noticed that the graduate-freshman differences are negative—the 
freshmen score higher on the Psychological Examination than do the 
graduates. This is probably explainable in terms of the composition 
of the two groups. It will be recalled that the graduates entered as 
freshmen in 1928-1930, whereas circumstances made it necessary to 
take as the freshman group those men entering in 1932-1933. If we 
suppose that the caliber of students electing the Agriculture curriculum 
has improved relative to the average student of the University during 
the years between the two groups, the result is explained. There is 
evidence from other studies that this actually occurred. There is, 
moreover, evidence that the poorer students migrate from other, 
more difficult curricula to the Agriculture curriculum, receiving their 
degrees there, and thus reducing the median score of the graduates 
relative to the freshmen. Either of these possibilities vitiates any
conclusions from the graduate-freshman difference in this curriculum.
The significance of Total Score, Analogies and Arithmetic is, however, 
shown by both of the other two criteria. 

The absence of significant differences between freshmen and 
graduates in Journalism finds its counterpart in the results of the 
Engineering curriculum for the Analogies test. The explanation again 
is similar, for we find that Engineering freshmen score higher than 
the graduates of all other curricula, indicating the presence of a 
selective factor operating to exclude from entrance in the Engineering
curriculum those who are deficient in the function measured by
Analogies.

The natural science group shows a further result of interest,
namely, that the Opposites test appears significant to an extent not 
in keeping with the original division into natural sciences and liberal 
arts curricula. In Engineering this significance is quite marked, showing
up in both the graduate median and the graduate-freshman
difference. In the B. S. curriculum, it is evidenced only by a high
correlation. In B. S. E., Opposites is shown to be significant by all 
three criteria, but in this group the result might be explained by the 
relation of the curriculum to the social science and liberal arts group. 
Whether this significance of Opposites for natural science studies is 
due to its high saturation with G (whatever is common to all tests) 
or whether it indicates the presence of a specific ability, not measured 
by Completion, is not answered by the results. 

These apparent exceptions which have been noted above merely 
serve to emphasize the consistency of the results in separating the cur- 
ricula into the two great divisions, thus substantiating the hypotheses 
of independent abilities, and differential curriculum requirements, 
and demonstrating that these independent abilities are influential 
in determining success or failure in the actual situations of acquiring 
knowledge. 

The Artificial Language test does not conform to the pattern of 
either the Completion and Opposites, or the Arithmetic and Analogies 
tests, though it appears more closely related to the Completion- 
Opposites group. The nature of the curricula in which it appears 
significant—Law, Business Administration, A. B., Engineering, and 
B. S.—makes possible two interpretations of the non-conformity: 
(1) That the Artificial Language test measures an immediate memory 
ability; or (2) that it is relatively high in its correlation with the 
general factor present in all the tests. 


SUMMARY AND CONCLUSIONS 


The American Council on Education Psychological Examination 
scores were obtained for the graduates of the University of Florida 
during the years 1928-1930, and for the freshmen entering in 1932- 
1933. The subjects were classified according to the curriculum 
pursued, and for each group of subjects in each curriculum, frequency 
distributions were made for score on each of the sub-tests and for 
Total Score. The median of each distribution, graduate-freshman 
difference between the medians, probable error of the difference, 
and critical ratio of the difference were computed separately for each 
curriculum, and for the freshmen in each curriculum the tetrachoric
coefficient of correlation between sub-test score and honor point 
average was obtained. These results are summarized in Table I. 
From the table the following conclusions seem justified: 

1. That there are several abilities measured by the sub-tests 
in the A. C. E. Examination which are independent of the ability 
measured in common by all the tests. 

2. That these abilities are required differentially for success in 
the several curricula offered at the University of Florida. 

3. On the basis of the abilities required for success, the curricula 
are roughly separable into two groups, one requiring the abilities 
measured by the Completion and Opposites tests, in addition to the 
common ability, the other group requiring, in addition to the common 
ability, the abilities measured by the Analogies and Arithmetic tests. 

4. On the basis of the content of the curricula in these two groups 
they may be designated as liberal arts-social sciences, and natural 
sciences respectively. 

5. The Arithmetic test presents correlations with freshman honor 
point averages which are, in the light of the previous conclusions, 
in excess of expectation. In the case of the Law students, this 
cannot be explained in terms of the relatively higher proportion of 
mathematical studies in the freshman course of study, and indicates 
strongly that the Arithmetic test measures some ability, probably
relating to rational problem-solving, required by Law students. 

6. The Opposites test, in addition to its importance in the social 
science-liberal arts studies, shows evidence of measuring some ability 
of importance in scientific work, which ability is not measured by the 
Completion test. Whether this result is due to a saturation of the 
Opposites test with the general ability, or to a special ability, is not 
answerable from the data. 

7. The Artificial Language test does not conform to either of the 
patterns set forth above, but is found to be significant in those curricula 
which require the highest amount of general ability as measured by 
Total Score. There is evidence that this test is either highly saturated 
with the general ability, or measures an immediate memory factor. 

8. Total Score is of importance in each of the curricula investi- 
gated, though in several of the curricula it was not the most important 
single test as measured by the criteria used in this investigation. 

9. The ability measured by the Analogies test (other than general 
ability) is of importance only in the natural sciences curricula, but 
there its significance is borne out by high graduate medians, significant
graduate-freshman differences, and high correlations with freshman 
honor point average. 

10. The results of this study indicate that the analytic scores on 
the A. C. E. Psychological Examination are highly useful in the 
practical problems of vocational and educational guidance, and they 
are so being used by the Bureau of Vocational Guidance at the Uni- 
versity of Florida. 

11. The fact that these abilities can be differentiated not only 
in the test situation, but also in the adjustment to college curricula, 
indicates that the abilities in question are not artifacts arising solely 
from the test situation, but represent abilities of consequence in a 
wide number of the individual’s adjustments. 


REFERENCES 


1. Thurstone, L. L.: Theory of Multiple Factors. Edwards Brothers, Ann Arbor,
Michigan, 1933.

2. Chesire, L., Saffir, M., and Thurstone, L. L.: Computing Diagrams for the Tetra- 
choric Coefficient. University of Chicago Bookstore, Chicago, 1933. 





PROGRESS IN THE SCIENCE OF CHILD STUDY¹


PAUL A. WITTY 


Northwestern University 


One of the most salutary developments in contemporary psychology 
is exemplified in the reanimation of genetic study of children. Several 
university centers are giving impetus to this movement and several 
groups are forming which promise to modify the whole structure of 
child psychology. One group finds its leadership in L. M. Terman, 
who, at Stanford University, has done much to alter time-worn 
procedures and to develop new vital technics for making genetic 
studies. Significant experimentation at Yale University, under the 
direction of Arnold Gesell, that at the University of Minnesota, 
directed by John Anderson, and that at the University of Iowa indicate
the rôle that the university center is playing in fostering and
encouraging scientific endeavor. Of course, other centers, too, are 
contributing to our knowledge of child development. 

The experimental child study movement finds high expression in 
A Handbook of Child Psychology (Second Edition, Revised), which
appeared about two years after the publication of the first Hand- 
book. The revision seemed necessary because of the recent accelerated 
productivity in research by students of child psychology, and because 
of serious omissions in the first edition. The book is for scholars; 
there is little effort to simplify or to appeal to the popular taste and 
mind. For all persons, however, who are interested in a thorough and 
systematic treatment of the experimental work in child development, 
the volume will have an unusual appeal. 

This book represents a far fling from the questionnaire studies 
actuated by G. Stanley Hall, and from the more recent pioneer work 
of E. L. Thorndike and J. B. Watson. Emphasis is placed upon 
studying and describing growth and developmental levels. Although 
a few original contributions are found, most of the authors survey, 
collate, and summarize significant studies. The focus of the book is 
upon the young child, and it is unfortunate that several of the chapters 
dealing with older children are so inadequately drawn. Lamentable 
omissions include the educational development, the interests, and the
play of children. Nevertheless, the book is the most comprehensive
to be found in child psychology.

1 Discussion of A Handbook of Child Psychology—Second Edition Revised.
Edited by Carl Murchison. Worcester: Clark University Press, 1933, pp. xii +
956.

The first chapter, by John Anderson, presents an overview and 
appraisal of scientific methods in child study.¹ Leonard Carmichael,
in the second chapter, performs a singularly successful feat in assem- 
bling the experimental studies upon prenatal development. Many 
psychologists have regretted the lack of an organized presentation 
of such material in accounts of child psychology. This is “an effort
to describe, by means of controlled observation, and, whenever possible 
by means of experimentation, a series of temporally separate stages 
in development.” This attitude, reflected in the foregoing statement, 
is found to dominate the presentations in several other chapters. 
Thus, attention is directed to the importance of study of stages or 
maturation levels in the development of the child. Much emphasis 
has been placed recently upon the significance of maturation. In 
their enthusiasm, one group of scholars seems to have oversimplified 
the rôle of maturation in describing human development. Indeed,
maturation has been pointed to as the primary cause of the “perfectly
integrated” expansion of behavior patterns. Maturation is the
shibboleth of certain Gestaltists who appear to have gone beyond the 
limits of their data in attempting to fit child development into 
the narrow confines of a simple developmental formula. Therefore, it 
is appropriate to inquire: Can development be summarized by a single 
formula?

Coghill aroused students of psychology and physiology by stating: “The
behavior pattern from the beginning expands throughout the growing normal
animal as a perfectly integrated unit, whereas particular patterns arise within the
total patterns and, by a process of individuation, acquire secondarily varying
degrees of independence . . . complexity of behavior is not derived by progressive
integration of more and more originally discrete units.” (Quoted on p. 142.)


An important postulate of the Gestalt school of psychology may be 
seen by examining the citation from Coghill. Human behavior, we 
are told, is characterized by purposeful reaction of the organism as 
a whole, and small “perfectly integrated units” appear as a result of 
individuation. Undoubtedly the process of individuation is one 
important means of accounting for phases of development; however, 
the writer believes that not all behavior is integrated from the time of
its initial appearance. Furthermore, much behavior appears purposeless
and random (in one sense, at least). Certainly it is unwise
to stress the rôle of individuation and to underrate the importance of
integration in the development of the highly modifiable human organism.
If the foregoing statements of Coghill were applicable to human
development, one would anticipate, I think, a highly synthesized
expression of some reflexes. But Furfey, Bonham, and Sargent show
that among seventeen responses such as the plantar and grasping
reflexes, the inter-correlations are zero (cited on p. 143). It will
expand the concept of Coghill too much, it seems, to assert that all
conditioned reflexes result from individuation of total purposeful
patterns.

1 In this review, the writer will comment on selected chapters only.

Therefore, the writer was pleased by the following conclusion of 
Carmichael: “in regard to the related processes of individuation and
integration, it seems that as yet, at any rate, it is better to record as 
unambiguously as possible the responses that can be made at any 
stage [of development] rather than to attempt to fit all development 
into one formula.”

This position is entirely tenable, since we do not know at what 
levels of human development true conditioning of responses is possible. 
Indeed, we need to know more precisely the nature and scope of many 
maturation-levels. Surely there is no sound reason for postulating a 
simple formula for human growth in the light of experimentation upon 
prenatal development. Nor can a simple formula be used to describe 
adequately the behavior of the newborn child. Karl C. Pratt, after 
examining 188 studies relating to the neonate, concludes that the 
newborn infant presents tremendous possibilities for development 
through “maturation” and through “learning”; however, “the
newborn infant is not as helpless as it is usually portrayed.” Some
responses are highly integrated, others appear to be unorganized. 
The possibilities for modifiability are manifold at this time. Despite 
the rather conclusive evidence presented by Carmichael and by 
Pratt, several writers continue to overemphasize the rôle of hereditary
predisposition and of maturation manifestations. Arnold Gesell, in 
another chapter, defines maturation thus: 

“Maturation is the intrinsic component of development (or of
growth) which determines the primary morphogenesis and variabilities
of the life cycle.” Growth is sometimes used as a synonym for
maturation; this is not satisfactory, since growth is the more comprehensive
term including all developmental differentiations in the
life cycle which occur in response to external as well as to internal
environments (p. 210).

Although Gesell recognizes the reciprocal relationship of heredity
and environment, he states: “The intimacy of this relationship may 
not, however, prevent us from ascribing a priority and possibly even 
some preponderance to hereditary factors in the patterning of human 
behavior . . . Environmental factors support, inflect, and modify, 
but do not generate the progression of development. Growth as an 
impulsion and as a cycle of events is uniquely a character of the living 
organism and neither physical nor social environment contains any 
architectonic arrangements even analogous to the mechanism of 
growth” (p. 211). 

Uniformity and regularity in human development are stressed by
Gesell, and M. M. Shirley, in a study of motor functions in babyhood,
reports an orderly sequence from time to time: “The nature of the
sequence indicates its dependence upon biological laws of growth” 
(p. 265). 

It is clear that these writers, through dependence upon excellent 
but highly limited studies, present a strong case for “maturation”
and for hereditary predispositions. Although the reviewer believes 
that Gesell is entirely justified in stating that the operation of an 
absolutely whimsical and fortuitous freedom for development is not 
discernible in or justified by the facts of growth, he feels that too 
much emphasis is placed upon maturation. Therefore he finds 
that the tone and the temper of the statements of Pratt, Carmichael, 
and Peterson (upon learning) are much more in keeping with his 
thought and with all the studies (limited though these all may be). 
The chapters on mental development and upon language by Florence 
Goodenough and by Dorothea McCarthy lend support to the position 
that subtle, numerous, and frequently fortuitous environmental 
factors play an enormously important rôle in child growth and perhaps 
in “maturation.” 

Phyllis Blanchard, in writing about the child with behavior difficul- 
ties, utilizes individual historical material to good effect if one accepts 
the traditional approach of the analyst. Despite the tendency to 
utilize the vague terminology of the psychoanalyst, Miss Blanchard 
reviews many related studies sympathetically and impartially. She 
emphasizes particularly the significance of emotional conflicts in child- 
hood, and she stresses the importance of many and varied environ- 
mental influences. 

The reviewer finds himself looking more and more to the environ- 
mental influences, sometimes gross and glaring, occasionally tenuous 
and subtle, for the provenance of behavior difficulties. He believes 
that studies such as those of Shaw, the Gluecks, and others, suggest 
the force which environmental conditions working in toto have in 
providing the “area” and the conditions from which delinquents 
emerge. Of great importance in this process is the school. A recent 
study (made under the direction of the writer) reveals beyond per- 
adventure the significance of the school in fostering delinquent tend- 
encies. Mr. Lane attempted to ascertain the manifold forces which 
operate in school to effect a situation from which the St. Charles 
delinquent comes. He found that apprehended delinquents seldom 
“like” school, and infrequently succeed therein.¹ Furthermore, 
failure to achieve even those barren goals of attainment revealed by 
standardized tests characterized these children. And the children 
were working below the meager standards which we have come to 
expect from predictions based upon the use of psychometric measures. 
It was found that many public schools had almost no records of the 
progressive stages in the development of the delinquents who were so 
frequently peremptorily dismissed without even a gesture toward 
rehabilitation. Even more serious were the home condition and the 
general environment of these children. Poverty and parental indiffer- 
ence were ubiquitous, and the attitudes and expectations of the 
parents and teachers provided a propitious motivating background for 
a delinquent career. The reviewer has come to see, in these delin- 
quent areas, the multitudinous causes of anti-social behavior. The 
distressing socio-economic conditions set a milieu from which escape 
is almost impossible—creating environs which nourish and reward 
the very traits upon which society has put its stamp of disapproval. 
Understanding and sympathetic treatment of each child from the 
onset of his first delinquency are necessary, but a sound therapeutic 
program must encompass also alleviation of perplexing social and 
economic factors. The school, viewed as a great social laboratory, 
must play a vital part in this endeavor. Of utmost importance is it 
that the process of education be envisioned to include re-education 
of the parental and the adult groups who are responsible for the guid- 
ance of the children. No easy recourse to the methods or labels of 
descriptive psychology is a substitute for remedying the grave social 
and economic conditions which set the pattern of, and direct the 
child to, maladjustment. Furthermore, no Adler, or Freud, or 
Rank method of individual analysis will replace studying the child 
in his social setting. 

¹ Lane, Howard A.: The Social and Educational Background of the Young 
Delinquent Boy. Ph. D. Dissertation. Deering Library, Evanston. 

Psychology is, I believe, retarded greatly by the tendency to over- 
simplify. And this tendency is found in several chapters of the 
Handbook. I find myself entirely unable to grasp the significance 
(or the validity) of the six fundamental appetites which Blatz con- 
fidently announces as “secondary motivating factors.” True, there 
are appetites or drives which appear at various stages of child- 
development, but their origin and usefulness in corrective endeavor 
will not, I think, be discovered by reference to this list of six. 

Special comment should be made concerning Terman and Burks’ 
provocative chapter, which presents a composite picture of the “gifted 
child.” Surely, no one can read this chapter critically without coming 
to the conclusion that “gifted children” should be identified early 
and given opportunity through special education. The unusual 
progress which Terman’s “gifted” have made during the period of 
their study shows clearly the obligation which society should assume 
in guiding the development of all unusually promising children. 
Several minor points in this discussion invite pause and criticism. 
For example, Terman and Burks announce that the ratio of gifted 
girls to gifted boys is 7:6 in the elementary and 2:1 in the secondary 
school. From my analysis of their Table I, I can not find the basis 
for this conclusion. Minor criticisms such as that just given do not 
detract from the significance of their presentation. I am happy to 
find that both Terman and Burks now grant that factors other than 
test-intelligence influence attainment, and that the highest peaks of 
achievement may be reached as a result, in part at least, of drives of 
the compensatory type as well as by the impelling force of innate 
endowment of surpassing degree. 

To L. S. Hollingworth were given two distinctly difficult tasks: 
the treatment of the talented and the discussion of the adolescent. 
The author was limited by the meagerness of data in these realms. 
Nevertheless, the description of the adolescent is the best brief account 
available. This should be read by all who are interested in child 
growth. 

R. Pintner, in his usual clear style, discusses ably the feeble-minded. 
Some of us believe that the vital problem now is to develop and 
evaluate experimental curricula; Pintner’s discussion shows clearly 
that barely a start has been made in this direction. The chapter is 
carefully documented and interestingly written. Similarly, F. 
Goodenough discusses intelligence, introducing much relevant data 
and an excellent critical evaluation of weighting, scaling, and predicting 
mental growth. H. E. Jones revives the albatross “Birth Order and 
Intelligence.” His is without doubt the best account in print, but 
the topic scarcely seems to warrant a chapter in this book. 

In Vernon Jones’ excellent chapter on Children’s Morals, and in 
Piaget’s upon Children’s Philosophies, we have revealed barren areas 
to which psychologists have contributed little. Jones was forced to 
depend to a considerable degree upon studies which have used the 
present stock of character tests, and Piaget depended in no small meas- 
ure upon subjective judgment. In these fields, different approaches 
and new basic and guiding concepts are needed before real progress 
can be made. The Handbook will have performed a worthwhile 
service if psychologists, upon reading these chapters, resolve to 
explore these neglected areas of development. 

This second edition of the Handbook of Child Psychology is the most 
comprehensive survey of the literature of child psychology now 
obtainable. Every serious student of child development should, 
I think, have ready access to this volume. 











THE EFFECT OF BINOCULAR IMBALANCE ON THE 
BEHAVIOR OF THE EYES DURING READING 


BRANT CLARK 
Psychology Laboratory, University of Southern California 


INTRODUCTION 


Since the early work of Schmidt¹¹ in 1917 there has been little 
emphasis on the binocular behavior of the eyes during reading. This 
has been true chiefly for two reasons. First, early investigators 
believed that movements of the two eyes were so nearly symmetrical 
that records of one eye indicated the behavior of the other with 
sufficient accuracy, and second, the chief interest was in determining 
the number and duration of fixations and regressions which could be 
done by observing but one eye. 

Schmidt found that during interfixation movements and especially 
during the return sweep the eyes make upward and divergence move- 
ments which necessitate downward and convergence movements 
during the fixation pauses. However, the value of this early experi- 
mental work was limited because of the low magnification of the 
photographic apparatus and the fact that the movements of each eye 
could be recorded in one meridian only. 

The possible importance of these convergence movements has been 
indicated by the recent work of Eames.* In comparing the ocular 
characteristics of unselected and reading disability groups Eames 
found that there was a greater percentage of exophoria* at the reading 
distance in the reading disability cases. Selzer'? goes so far as to 
state that eye muscle imbalance, alternating vision, and lack of 
fusion account for such reading disabilities as are not accounted for 
by general mental disability. These findings raise the problem as to 
the effect of high exophoria on the behavior of the eyes in reading. 
A priori it would appear that marked anomalies of binocular balance 
should cause definite irregularities of the eye movements and be a 
significant factor from the point of view of remedial reading. If this 
actually is the case, then, of course, the problem of producing correct 
binocular balance should be considered as a fundamental part of a 
remedial reading program. 





* The tendency of either or both eyes to turn out when fusion is prevented by 
the use of a displacing prism or when one eye is covered by an opaque screen. 
In another paper® Eames states, “Exophoria and esophoria often 
result in momentary overlapping of words and letters in reading.” 
Farris? makes a similar statement and presents data which tend to 
show that students having but one eye show greater improvement in 
reading than students with “normal” binocular vision. This is 
explained as due to the fact that these individuals do not have to make 
binocular adjustments. On the other hand he also reports that 
students with binocular imbalances show a greater improvement 
in reading than the “normal” students. However, these findings 
can not be considered to be highly significant because of the small 
differences between the two groups. 

It was the purpose of the present study to attack this problem using 
the method of eye movement photography to determine any difference 
in the binocular behavior of the eyes of “normal” individuals and those 
having high exophoria. 


EXPERIMENTAL PROCEDURE 


1. Subjects.—The subjects used in this investigation were freshmen 
in the University.* The binocular balance of one hundred ninety-one 
men and women was measured by a method described below to 
select cases showing high exophoria. Eleven subjects, nine men 
and two women having a near point exophoria ranging from twelve 
to sixteen diopters, were chosen as the experimental group. They 
were selected from the upper nine per cent of the large group. These 
eleven cases were matched with eleven others having more nearly 
correct binocular balance (zero to two diopters of exophoria) as to 
sex and score in reading comprehension and linguistic ability from 
the Thorndike Intelligence Examination for High School Graduates, 
Form B. The examination was given as a part of the routine entering 
examination for freshmen. The subjects’ scores fell in the central 
ninety per cent of an unselected group. 

* The cooperation of Mr. P. A. Libby in obtaining these students from his 
classes in orientation is gratefully acknowledged. 

† One diopter is the angular displacement of one centimeter at a distance of one 
meter and hence is a measure of the angular rotation of the eye. It is a standard 
unit used by eye specialists. 

2. Apparatus.—The instrument used to test the binocular balance 
consisted of two three-diopter displacing prisms which were placed 
in a frame held by the subject. The test object was a chart containing 
an arrow above a line of letters and was placed thirteen inches from the 
eyes of the subject. The two prisms caused diplopia, i.e. two arrows 
and two lines of letters were seen. The subject merely reported the 
letter to which the lower arrow pointed. The letters were so placed 
that the subject’s report indicated the binocular imbalance directly 
in diopters. If the subject reported a considerable amount of oscilla- 
tion of the arrow, the midpoint of this oscillation was taken as the 
proper measure. 
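
The footnote definition of the diopter (a displacement of one centimeter at a distance of one meter) can be turned into an angle with a single arctangent. The sketch below is an illustration only and is not part of the original apparatus or analysis; the function names are hypothetical, and it assumes the prism diopter relation, diopters = 100 × tan(angle).

```python
import math


def prism_diopters_to_degrees(diopters: float) -> float:
    """Angle in degrees whose tangent is (diopters cm) / (100 cm)."""
    return math.degrees(math.atan(diopters / 100.0))


def degrees_to_prism_diopters(degrees: float) -> float:
    """Inverse conversion: displacement in cm at one meter for a given angle."""
    return 100.0 * math.tan(math.radians(degrees))


if __name__ == "__main__":
    # 2 diopters: upper limit of the control group; 3: each displacing prism;
    # 12 and 16: range of exophoria in the experimental group.
    for d in (2, 3, 12, 16):
        print(f"{d:>2} diopters = {prism_diopters_to_degrees(d):.2f} degrees")
```

On this reading, the twelve to sixteen diopters of the experimental group correspond to roughly seven to nine degrees of rotation.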


The eye movements were photographed using the camera pre- 
viously described by Clark.¹ This instrument gives records of the 
horizontal and vertical movements of both eyes on a single record 
with high magnification. The subjects read two paragraphs from 
the introduction to Student’s Guide for Demonstration of Psychological 
Experiments by Milton Metfessel. The length of the lines was 
fifteen centimeters, and the material was placed approximately thirty- 
three centimeters from the eyes. 

The experimental procedure consisted first of a check on the original 
measurement of binocular balance. Then the two eyes of the subject 
were compared as to visual acuity at the reading distance to assure 
equal vision. The subject was instructed to read the test material 
so that he could answer questions on it. Following this the eyes 
were photographed while the two paragraphs were read and the 
subject was immediately tested for comprehension. The whole 
procedure was the same for both groups and lasted between fifteen 
and twenty minutes. 


RESULTS AND DISCUSSION 


1. Fixations and Regressions.—The eye movement records were 
examined to determine any differences in the two groups. The average 
number of fixations per line for the experimental group was found 
to be 13.81 ± 1.69 and for the control group 13.59 ± 1.59. 

D/σ diff. = 0.25. 

A similar tabulation was made for the number and duration of regres- 
sions per line. These included only the regressions made after the 
first forward movement and were differentiated from “initial regres- 
sions” which included all of the regressions prior to the first forward 
movement in the line. These regressions were tabulated because of the 
possibility that the exophores having a binocular imbalance might 
make more regressions in becoming oriented at the beginning of the 
line. For a similar reason the time prior to the first forward move- 
ment was also determined. These findings together with the average 
reading time per line for the two groups are tabulated in Table I. 


TABLE I.—A COMPARISON OF THE OCULAR BEHAVIOR OF “NORMAL” INDIVIDUALS 
AND THOSE HAVING HIGH BINOCULAR IMBALANCES 

                                                 | Experimental  |               | 
                                                 | group         | Control group | D/σ diff. 
Average number of fixations per line             | 13.81 ± 1.60  | 13.59 ± 1.59  | 0.25 
[Average number of regressions per line]         |  1.00 ± 0.63  |  0.85 ± 0.50  | 0.46 
Average number of “initial regressions”          |  1.15 ± 0.31  |  1.04 ± 0.29  | 0.65 
Average duration of regressions in [illegible]   |  5.24 ± 0.88  |  5.42 ± 0.87  | 0.37 
Time prior to initial forward movement           | 12.38 ± 0.96  | 12.08 ± 1.35  | 0.40 
  in [illegible]                                 |               |               | 
Average reading time per line in sec.            |  3.35 ± 0.59  |  3.74 ± 0.59  | 1.00 





These data show quite clearly that there is no significant difference 
between the two groups as far as these factors are concerned with the 
possible exception of the average reading time per line, and that 
difference is small if there is any difference at all. In other words 
individuals having high binocular imbalances of the divergence type 
and of the extent indicated make, on the average, practically the same 
number of fixations and regressions as do individuals having what is 
considered to be a more ideal binocular balance at the reading distance. 
The average reading time per line is not more than slightly different, 
and practically the same time and number of movements are required 
at the beginning of the line for both groups. It should be pointed 
out, however, that these measurements were taken over a short period 
of time, and the factor of fatigue could not have influenced the 
results. 


2. The Binocular Factors in Reading.—The photographic records 
were also examined to determine the magnitude and duration of the 
divergence movements at the beginning of each line. The results 
of these measurements for the two groups are summarized in Table II. 
It should be pointed out that these movements were just the reverse of 
those reported by Schmidt.¹¹ It was found that the eyes in moving 
from the end of one line to the beginning of the next made definite 
convergence adjustments which required divergence adjustments at 
the beginning of the line. This over convergence probably results 
because the movement involves a positive focusing adjustment. 
This normally involves convergence and in reading results in an 
over convergence because the eyes are already converged for the 
proper distance. 


TABLE II.—A COMPARISON OF THE DIVERGENCE MOVEMENTS MADE BY “NORMAL” 
INDIVIDUALS AND THOSE HAVING HIGH BINOCULAR IMBALANCES 

                                        | Experimental  |               | 
                                        | group         | Control group | D/σ diff. 
Duration of divergence movements        |               |               | 
  [in 1/25 sec.]                        | 2.41 ± 0.35   | 2.34 ± 0.55   | 0.28 
Extent of divergence movements          | 39.1 ± 10.7′  | 30.0 ± 9.5′   | 1.70 











It will be noted in Table II that the difference in time required for 
the divergence movements was only 0.07/25 second and was not 
statistically significant. In other words the exophores required no 
more time to make the divergence movements than did the “normal” 
group within the limits of the test. The average time for these move- 
ments was 95σ with a range from 40 to 240σ. It can be seen that 
these movements are slower than typical saccadic movements. These 
results are similar to the findings of Dodge‘ and Judd? although their 
subjects made movements between objects placed at different distances. 
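
The durations in Table II appear to be in units of 1/25 second, since the text gives the group difference as 0.07/25 second. The sketch below converts them to thousandths of a second. It is an illustration only; the constant and function names are hypothetical, and the unit is an inference from that one phrase rather than something the table states outright.

```python
# Assumed time unit of the photographic record: 1/25 second.
RECORD_UNIT_SECONDS = 1.0 / 25.0


def record_units_to_ms(units: float) -> float:
    """Convert a duration in 1/25-second units to milliseconds."""
    return units * RECORD_UNIT_SECONDS * 1000.0


# Mean durations of the divergence movements, from Table II.
experimental_units = 2.41
control_units = 2.34

difference_ms = record_units_to_ms(experimental_units - control_units)
overall_mean_ms = record_units_to_ms((experimental_units + control_units) / 2)

if __name__ == "__main__":
    print(f"experimental group: {record_units_to_ms(experimental_units):.1f} ms")
    print(f"control group:      {record_units_to_ms(control_units):.1f} ms")
    print(f"difference:         {difference_ms:.1f} ms")
    print(f"mean of both:       {overall_mean_ms:.1f} ms")
```

Under this assumption the two group means average about 95 thousandths of a second, and the difference between the groups is under three thousandths of a second.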

The average size of the divergence movements at the beginning of 
the lines was 9.1 minutes larger for the experimental group. This 
difference is probably significant, but in spite of this the time 
required to make them was practically the same for both groups. 
This raises the question of the importance of exophoria in reading. 
As has been shown above, the reading time, the number of fixations 
and regressions, et cetera were not significantly different between 
the two groups. 
The importance of exophoria must be in the larger convergence and 
divergence movements made by the exophoric individuals. It is 
a well known fact that convergence or fusion movements of any sort 
are very fatiguing, and the larger divergence movements of the 
exophores become increasingly important when it is 
pointed out that each subject made a divergence movement at the 
beginning of every line. 

During the return sweep the eyes also moved up slightly requiring 
a downward as well as a divergence movement at the beginning of 
each line. The extent of this vertical movement ranged from forty-six 
minutes to one hundred thirty-five minutes with an average of sixty- 
five minutes. Divergence movements were also made at the beginning 
of each fixation within the line. These movements as a whole were 
too small to be measured accurately enough for comparative purposes, 
but they were of the order of ten minutes on the average and probably 
larger for the exophoric group as a whole. 

It has been mentioned above that according to Eames exophoria 
often results in momentary overlapping of letters and words in reading. 
None of the subjects reported any doubling of the print in spite of the 
fact that divergence movements amounting to as much as 3 degrees 
were made at the beginning of the line, and the time required to 
complete these movements was as much as one-fifth second. Ocular 
rotations which would equal the angular size of a letter (approxi- 
mately twenty-five minutes) would have been easily distinguished on 
the records, but no such movements occurred. In other words the 
subjects made no ocular movements which could possibly cause 
diplopia with the possible exception of those at the beginning of the 
line, and none were reported there in spite of the large divergence 
movements. However, it is entirely possible that such movements 
would occur at the end of long reading periods or when fatigue is 
present. 

One factor which seems clear from the preceding results is that any 
excessive reading fatigue which may result from a condition of exo- 
phoria is certainly not a muscular fatigue. The experimental group 
made divergence movements on the average only 9.1 minutes larger 
than the control group, and it is inconceivable that a difference of this 
amount could cause any additional fatigue. Thus it is evident that 
any such fatigue is due definitely to sensory processes in certain 
respects similar to flicker fatigue. It is a well known fact that fusion 
movements to prevent diplopia when made under conditions similar 
to “eye exercises” cause fatigue to set in very rapidly. In a similar 
way the exophores made excessive fusion movements at the beginning 
of each line read and probably during each fixation to a lesser degree. 
These fusion movements could only be expected to cause a similar 
fatigue. This problem of fatigue is being attacked in this laboratory 
at the time of this writing. 

The data were also examined to determine whether or not the 
magnitude or time of the divergence movements at the beginning of 
the lines varied between the two eyes. The ocular dominance of 
twenty of the subjects was determined using Miles’ V-scope technique.¹⁰ 
It was found that for twenty-five per cent of the subjects the dominant 
eye completed the divergence movements first a significant number of 
times, the non-dominant eye in fifteen per cent of the cases and 
there was no significant difference in the remaining sixty per cent. 
In other words, in the majority of the cases one eye would arrive at the 
fixation point first as many times as the other. These findings are 
similar to those of Clark? who studied convergence movements during 
stereoscopic vision. 

The extent of the divergence movements was also tabulated. An 
examination of the results showed that the dominant or sighting 
eye made divergence movements which were significantly smaller 
in sixty-five per cent of the cases while the reverse was true in only 
five per cent of the subjects. Thirty per cent of the readers exhibited 
no significant difference between the two eyes. It would appear from 
these results that the sighting eye is unimportant in relation to the 
time required to complete a divergence movement, but the non- 
dominant eye on the average makes larger divergence movements 
than the dominant eye. 

These results tend to substantiate the conclusion of Crider* that eye 
muscle imbalance, visual fusion, and ocular dominance are closely 
related. On the other hand they do not substantiate his statement 
that the dominant eye has the most efficient musculature. The 
results of this study show that there is a great similarity in the general 
behavior of the two eyes. The differences seem to be due definitely 
to sensory rather than to motor factors. Crider’s limitation of the 
muscular imbalance to but one eye is also not in keeping with these 
results which show that the greater convergence movements definitely 
alternated between the two eyes in every case in spite of the fact 
that one eye predominated. Also, by definition, heterophoria is 
fundamentally binocular. 


SUMMARY 


A comparison of the eye movement records of individuals having 
“normal” binocular balance with those of an exophoric group leads to 
the following conclusions: 

1. There was no significant difference between the two groups in 
regard to the number and duration of fixations and regressions per 
line as shown by Table I. 

2. There was no significant difference in the time required to 
complete divergence movements at the beginning of the lines. 

3. When the dominant and non-dominant eyes were compared 
as to the time required to complete the divergence movements, it was 
found that there was no significant difference. 

4. A relation between binocular imbalance, fusion movements, 
and ocular dominance is suggested by the fact that sixty-five per cent 
of the subjects made significantly larger divergence movements with 
the non-dominant eye. 

5. The exophoric group made greater divergence movements 
at the beginning of the lines. These divergence movements probably 
cause enough excessive reading fatigue to be of considerable impor- 
tance from the point of view of remedial reading. 
AN EXPERIMENTAL STUDY OF THE DEVELOPMENT 
OF CERTAIN ASPECTS OF REASONING 


W. H. PYLE 


Wayne University 


Before curricular material can be properly placed with reference 
to school grades, we must know what kind of reasoning problems 
can be solved by pupils at the various grade-levels. 

The study here briefly reported is an attempt to discover by 
experimentation what type of arithmetical problem can be solved 
and what kind of literary material can be interpreted by pupils in 
various school grades. 


EXPERIMENT I: ARITHMETICAL REASONING 


Material and Method.—After considerable preliminary experi- 
mentation on the types of problem to be used and the manner of 
statement, the list of ten problems given below was determined upon. 
The ease of solution of a problem depends somewhat on the way in 
which the problem is stated. It was thought best to adhere to no 
standard form of statement. On the contrary, while the statement 
was always simple, there was considerable variation from problem 
to problem. Therefore a careful reading and analysis of the problems 
by the pupils were necessary. 

The method was to give each pupil a printed list of the ten prob- 
lems. The pupil indicated his answers on a separate sheet of paper 
on which was also written other necessary data as to age, grade and 
sex. Ten minutes were allowed for the solution of the problems. 

The problems were given in certain Detroit and suburban schools 
from the third grade through the twelfth, in most grades to approxi- 
mately one thousand pupils. In grades ten, eleven, and twelve, 
only a small number, about one hundred, were tested in each grade. 


ARITHMETIC PROBLEMS 


Arranged in order of increasing difficulty. After each problem 
the grade in which it was first solved by three-fourths of the pupils 
is indicated. 


1. If one apple is worth two pennies, how many pennies are two apples 
worth? Third grade. 

2. If John has four marbles and Jim has two marbles, how many marbles 
do they both have together? Third grade. 

3. If John has four marbles and Jim has two marbles, how many more 
marbles has John than Jim? Third grade. 

4. If a pencil costs two cents, how many pencils can you buy for eight 
cents? Fourth grade. 

5. If John is four years old and Mary is two years old, John is how many 
times as old as Mary? Fourth grade. 

6. If two men can paint a house in four days, how long will it take four 
men to paint it? Fifth grade. 

7. If you can buy four apples for eight cents, how many apples can you 
buy for ten cents? Sixth grade. 

8. If one-half apple is worth one penny, how many apples are worth four 
pennies? Sixth grade. 

9. If it takes a cup of sugar to make candy for four people, how much 
sugar will it take to make candy for six people? Sixth grade. 

10. If two apples are worth four pennies, how many pennies are three 
apples worth? Seventh grade. 


THE RESULTS 


The number of pupils giving the correct answer to each problem 
in every grade was determined and the percentage which this number 
was of the whole number of pupils, calculated. In Table I the per- 
centage of pupils giving the correct answers is indicated by grades. 
The lowest grade in which at least three-fourths of the pupils gave the 
correct answers is indicated by black-face type. 











TABLE I 

        |                                   Grade 
Problem |   3   |   4   |   5   |   6   |   7   |   8   |   9   |  10   |  11   |  12 
   1    |  90.0 |  93.0 |  97.0 |  98.0 |  98.5 |  97.5 |  99.0 |  99.0 | 100.0 | 100.0 
   2    |  86.5 |  92.5 |  95.5 |  98.0 | 100.0 | 100.0 |  99.5 |  98.0 |  98.0 |  99.0 
   3    |  79.5 |  90.5 |  96.0 |  98.0 |  99.0 |  99.0 |  98.5 |  98.5 |  99.5 |  98.0 
   4    |  66.5 |  80.5 |  90.5 |  95.0 |  98.0 |  98.5 |  99.5 | 100.0 |  99.5 |  97.5 
   5    |  72.0 |  78.6 |  86.0 |  90.5 |  94.5 |  95.0 |  95.0 |  96.0 |  92.5 |  94.5 
   6    |  58.5 |  69.0 |  78.0 |  87.5 |  90.0 |  92.0 |  94.0 |  91.5 |  93.0 |  92.0 
   7    |  46.0 |  56.5 |  72.0 |  85.6 |  91.5 |  95.0 |  95.5 |  97.5 |  94.5 |  98.0 
   8    |  47.0 |  62.0 |  72.5 |  81.0 |  88.0 |  91.5 |  92.5 |  88.5 |  87.5 |  89.5 
   9    |  15.0 |  36.5 |  59.0 |  79.0 |  87.0 |  93.5 |  93.5 |  93.0 |  96.0 |  95.0 
  10    |  46.5 |  49.5 |  57.0 |  71.0 |  76.6 |  86.0 |  87.5 |  92.0 |  88.5 |  88.5 
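
The three-fourths criterion described above can be checked mechanically against the table. The sketch below is an illustration only and is not part of the original study: it copies the percentages from Table I and finds, for each problem, the lowest grade in which at least 75 per cent of pupils answered correctly. The variable and function names are hypothetical.

```python
from typing import Optional

GRADES = list(range(3, 13))  # grades three through twelve

# Percentage of pupils answering correctly, by problem, from Table I.
PERCENT_CORRECT = {
    1: [90.0, 93.0, 97.0, 98.0, 98.5, 97.5, 99.0, 99.0, 100.0, 100.0],
    2: [86.5, 92.5, 95.5, 98.0, 100.0, 100.0, 99.5, 98.0, 98.0, 99.0],
    3: [79.5, 90.5, 96.0, 98.0, 99.0, 99.0, 98.5, 98.5, 99.5, 98.0],
    4: [66.5, 80.5, 90.5, 95.0, 98.0, 98.5, 99.5, 100.0, 99.5, 97.5],
    5: [72.0, 78.6, 86.0, 90.5, 94.5, 95.0, 95.0, 96.0, 92.5, 94.5],
    6: [58.5, 69.0, 78.0, 87.5, 90.0, 92.0, 94.0, 91.5, 93.0, 92.0],
    7: [46.0, 56.5, 72.0, 85.6, 91.5, 95.0, 95.5, 97.5, 94.5, 98.0],
    8: [47.0, 62.0, 72.5, 81.0, 88.0, 91.5, 92.5, 88.5, 87.5, 89.5],
    9: [15.0, 36.5, 59.0, 79.0, 87.0, 93.5, 93.5, 93.0, 96.0, 95.0],
    10: [46.5, 49.5, 57.0, 71.0, 76.6, 86.0, 87.5, 92.0, 88.5, 88.5],
}


def first_grade_reaching(percentages: list, threshold: float = 75.0) -> Optional[int]:
    """Return the lowest grade whose percentage meets the threshold, if any."""
    for grade, percent in zip(GRADES, percentages):
        if percent >= threshold:
            return grade
    return None


first_grades = {
    problem: first_grade_reaching(row) for problem, row in PERCENT_CORRECT.items()
}

if __name__ == "__main__":
    for problem, grade in first_grades.items():
        print(f"Problem {problem:>2}: first solved by three-fourths in grade {grade}")
```

Run against the table, this yields the same grade placements as those printed after each problem in the list of arithmetic problems.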



































Problems 1, 2, and 3 can be solved by pupils in the third grade. 
They are very simple problems involving multiplying in the first, 
adding in the second, and subtracting in the third. 
Problems 4 and 5 can be solved by pupils in the fourth grade. 
Problem 4 is a matter of finding how many articles can be bought for a 
certain small amount of money when the cost of one article is known. 
Problem 5 involves a very simple comparison. 

Problem 6 can be solved by pupils in the fifth grade. It is a very 
simple problem in inverse ratio. 

Problems 7, 8, and 9 can be solved in the sixth grade. In problems 
7 and 8 the unit of cost must first be determined. Problem 9 is a 
very simple problem in changing a recipe for a different number of 
people. This problem is especially difficult for pupils in the third 
and fourth grades. 

Problem 10 proved most difficult and could not be solved by three- 
fourths of the pupils till the seventh grade was reached. It is a matter 
of going from the cost of two articles to the cost of three. 


SEX DIFFERENCES 


A proper study of sex differences should be based on age. How- 
ever, for practical purposes, it was thought best in this study to base 
the result on grade. In problems 1, 3, 5, and 10 there is not more 
than one per cent of difference in the results for boys and girls. In 
problems 4, 6, 7, 8, and 9 the boys are slightly better than the girls. 
The largest difference is on problem 9, the recipe problem, 4.2 per cent 
more boys than girls solving the problem correctly. 


INTERPRETATION 


A fair interpretation of the results of this study would be somewhat 
as follows: Third-grade children can solve arithmetic problems involv- 
ing finding cost of a few articles when the cost of one is known, the 
number being small; and very simple problems involving finding sum 
or difference. In the fourth grade the pupils can solve very simple 
problems involving finding how many articles can be bought for a 
small amount of money when the cost of one is known, and problems 
involving very simple comparisons. Fifth-grade pupils can solve 
simple problems in inverse ratio. 

In the sixth grade, pupils can solve simple problems in which the 
unit of cost or comparison must first be determined. If such problems 
are a trifle more difficult, as in the tenth, they cannot be solved by 
three-fourths of the pupils till the seventh grade is reached. 

While some pupils in the third grade can solve the most difficult 
problems in the list, not till the tenth grade is reached is any problem 
solved by all pupils. 

A fair representation of the development of reasoning ability in 
arithmetic is shown in Graph I, which is the grade development curve 
for problem 10, which, as measured by the grade in which it can first 
be solved, is the most difficult. 

From my own observation, and from talking to many teachers of 
arithmetic in the elementary schools, I have had the impression that 
the problems involving reasoning found in the arithmetics in the 
elementary grades, are too difficult. The results of this experiment 
confirm me in that opinion. In this study, no analysis was made of 
the processes used by the pupils in finding the answers to the problems. 
Such a study shall be our next endeavor. 


EXPERIMENT II: THE INTERPRETATION OF SIMPLE LITERARY 
SELECTIONS 


Material and Method.—Five selections from very simple poetry 
constituted the material, and “multiple choice,” the method. The 
pupils were given the printed selections with five statements on each 
selection, one statement being the correct answer to the question. 
No time limit was set, the pupils being given all the time needed. 
As a preliminary to the experiment, the selections were submitted to 
about eighty students in each grade from the third to the sixth and 
the pupils wrote out their interpretations. From their written 
answers the statements used in this experiment were selected. 


LITERARY SELECTIONS 


Arranged in order of increasing difficulty, with grade in which 
they were correctly interpreted by three-fourths of pupils indicated. 


I 


UPS AND DOWNS 


Our sleds are slip’ry things, 

They w-h-i-z-z down hills of snow. 
They will not stop a bit, 

They go! and go! and go! 


They reach the bottom then— 
They stop! and stop! and stop! 
Till taken by their fronts, 
And d-r-a-g-g-e-d up to the top! 
They all can run down hill 
So easily. Now then— 
Why can’t they learn to turn 
And run up hill again? 

Fifth Grade 


Question.—Why can’t the sled run up hill? 
1. The hill is too slippery. 

2. The sled is too heavy. 

3. The sled has no wheels. 

4. The hill is too steep. 

5. The sled has no power to go up by itself. 


II 


ANNA’S BANANA 


“I ate a banana,” said Anna, 

“I began at its top, and ate down! 
I ate with my very best manner, 
Not a bit of it fell on my gown. 


“But its skin, I’d forgotten, entirely, 
And it fell on the ground at my feet, 
Then when I got up, I sat down, 
Right down, on the dusty old street!” 

Sixth Grade 


Question.—What made Anna sit down? 

1. The banana made her sick. 

2. She wanted to pick up the banana skin. 

3. She slipped on the banana skin. 

4. She forgot to throw it in the garbage can. 

5. She wanted to eat the banana. 



































The Journal of Educational Psychology 


III 
BALLOONS 


If you throw other things 
In the air (as we do) 
They’ll come hurrying down 
And return straight to you! 


But Balloons, when they’re free, 

Sail away to the sky. 

They fall up and not down! 

Will you please tell me—why? 
Seventh Grade 


Question.—Why do they fall up? 

1. Because the wind blows things. 

2. Because they are lighter than air. 

3. Because they are on strings. 

4. Because they are round. 

5. Because they have some kind of electricity in them. 


IV 
OH, THE WEE GREEN APPLE 


I ate a small green apple; 

It tasted good, and yet— 

I wish that small green apple 

And I had—never met! 
Seventh Grade 


Question.—Why does he wish that he had never met the apple? 

1. The green apple made him sick. 

2. The apple was sour. 

3. The apple had worms. 

4. Because he was not hungry. 

5. Because green apples are not good for children. 


V 


THERE'S A CITY 


There’s a City that is 

And that isn’t, 

That nobody sees 

Except me! 

If you don’t know the way to 
Look at it, 

’Tis a City you 
Never will see! 
Eighth Grade 


Question.—What city is this? 
1. Heaven. 

2. Mars. 

3. Fairyland. 

4. The city you imagine. 

5. Toyland. 


THE RESULTS 


The results, in terms of percentage of pupils giving correct answers, 
are given in Table II. 


TABLE II 
The numbers represent the percentage of pupils giving the correct interpretation 
of the selection. Percentages of 75 or more, that is, the grades in which 
three-fourths of the pupils gave correct answers, are marked with an asterisk. 

                              Grade 
Selection      3       4       5       6       7       8 
I            43.5    67.5    78.0*   86.5*   87.5*   93.5* 
II           32.0    63.5    71.0    83.0*   87.5*   92.5* 
III          28.0    46.0    59.5    68.5    79.0*   87.5* 
IV           33.0    44.5    56.5    67.0    76.5*   83.5* 
V            37.5    44.5    54.5    66.5    70.0    79.5* 























It will be seen that, using the 75 per cent criterion as with the 
arithmetic problems, selection I can be interpreted in grade five; 
selection II, in grade six; selections III and IV in grade seven; and 
selection V, in grade eight. 
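
The 75 per cent criterion can be applied mechanically to the percentages of Table II. The following is a minimal sketch, not part of the original study; the function name `grade_placement` and the data layout are illustrative assumptions, and the figures are copied from the table.

```python
# Percentages of correct interpretations from Table II, for grades 3 to 8.
TABLE_II = {
    "I":   [43.5, 67.5, 78.0, 86.5, 87.5, 93.5],
    "II":  [32.0, 63.5, 71.0, 83.0, 87.5, 92.5],
    "III": [28.0, 46.0, 59.5, 68.5, 79.0, 87.5],
    "IV":  [33.0, 44.5, 56.5, 67.0, 76.5, 83.5],
    "V":   [37.5, 44.5, 54.5, 66.5, 70.0, 79.5],
}
GRADES = [3, 4, 5, 6, 7, 8]


def grade_placement(percents, grades=GRADES, criterion=75.0):
    """Return the lowest grade whose percentage meets the criterion, or None."""
    for grade, pct in zip(grades, percents):
        if pct >= criterion:
            return grade
    return None


# Hypothetical helper usage: one placement per selection.
placements = {sel: grade_placement(p) for sel, p in TABLE_II.items()}
print(placements)
```

Under these assumptions the placements agree with those stated in the text: grade five for selection I, six for II, seven for III and IV, and eight for V.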

The selections do not have quite the same definiteness as was the 
case with the arithmetic problems. Selection I involves a considera- 
tion of the effect of gravity. Selection II involves determining the 
cause of an accident. Selection III is a matter of determining why 
certain things rise in the air. It will be noted that this selection 
was most difficult for third grade pupils. Selection IV has to do with 
the effects of eating green apples. It is not merely a matter of experi- 
ence with green apples but rather catching the meaning and significance 
of the lines. Selection V, the most difficult, involves the discernment 
that the city talked about is a matter of individual imagination. 

The course of development in ability to interpret such literary 
material as that used in the experiment is shown in Graph II. The 
data for the graph were obtained by combining the percents for the 
various selections. 

The results of this experiment will not enable an author of text- 
books in reading to place properly the different selections, although 
they may be at least suggestive. Readers, scientifically constructed, 
should have the grade placement of every selection determined in 
accordance with the procedure of this experiment. The vocabulary 
of a selection has little to do with its proper grade-location. A 
twelfth-grade selection, for example, may be written with first-grade 
vocabulary. Proper placement of selections depends to a great extent 
on difficulty of interpretation of the thought. 

The graph showing development is significant. Note the great 
and rapid development from the third grade to the eighth. The results 
of this interpretation experiment may perhaps be profitably used by 
teachers in measuring the literary reasoning capacity of their pupils. 


ABILITY OF STUDENTS TO ESTIMATE THEIR 
GRADES ON A MULTIPLE CHOICE EXAMINATION 


MILTON L. BLUM 
College of the City of New York 


Despite the numerous publications on the many phases of judgment 
and estimation there is a considerable lack of material pertaining to the 
specific subject of this study; to wit: the prediction of grades on a 
multiple choice examination. 

Students taking a course in General Psychology at the College of 
the City of New York are required to take a final examination. It 
covers the term’s work and consists of one hundred multiple choice 
questions. 

After the papers were distributed but before the students actually 
started to answer the questions, the following instructions were given: 
“You will notice that at the end of the examination you are asked to 
guess your mark. Seriously attempt to estimate your score on the 
basis of one for each correct answer. Your estimated or guessed 
mark will have no bearing on your actual mark.” Just before the 
close of the examination period, students were reminded to fill in the 
guessed mark. 

This procedure was repeated for three consecutive terms. The 
first term there were one hundred fifty-three students in the group; 
one hundred forty-one of these attempted to guess their grade. The 
second group comprised one hundred seventy-six students, of 
whom one hundred sixty estimated their grade. In the third group 
one hundred seventy-nine out of one hundred ninety-seven attempted 
to estimate their grade. 

An explanation of the fact that every student did not attempt to 
estimate his grade is given by the following two reasons. First, 
students in the excitement of taking the examination forgot to estimate 
their mark upon finishing (many would return stating that they 
forgot to fill in their guessed mark). Second, (and in a small minority 
of cases) students would be reluctant to guess, as they said there was 
no way of knowing what score they would obtain. 

The end term examination was used as the basis for the experi- 
ment because it can be assumed reasonably that every one taking the 
examination has done a considerable amount of studying and is 
sincere in his desire to obtain the highest possible grade. As the test 
result is important for each student, the writer is of the opinion that 
each subject attempts to do his best. Therefore, the results should 
be much truer than an ordinary laboratory study. 


RESULTS 


The coefficients of correlation, computed by the product-moment 
method using the Cureton-Dunlap charts, between the actual grades 
and estimated grades are given in Table I. 


TABLE I.—COEFFICIENTS OF CORRELATION BETWEEN ACTUAL AND ESTIMATED 
GRADES 

Group          r        σ_r 
I            +.221      .08 
II           +.455      .062 
III          +.429      .061 








From the data, therefore, it can be concluded that students, while 
they do not estimate their grade accurately enough for prediction 
purposes, do show a positive tendency to estimate their grade. Upon 
investigating whether there exists a reliable difference between the 
correlations for the three groups, it is found that the ratio of the 
difference to its standard error (Diff./σ diff.) is 2.3 for one pair of 
coefficients and .24 for another. Therefore, 
it can be seen that the differences that exist are not statistically 
reliable. 
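
The reliability check described above can be sketched as follows. This is an illustration only: it assumes the classical large-sample standard error of a correlation, (1 - r²)/√N, and takes N as the number of students who estimated their grade in each term (141, 160, and 179). Neither assumption is stated in the article, and the function names are hypothetical.

```python
import math


def sigma_r(r, n):
    """Assumed standard error of a correlation coefficient: (1 - r^2) / sqrt(n)."""
    return (1.0 - r * r) / math.sqrt(n)


def critical_ratio(r_a, n_a, r_b, n_b):
    """Difference between two r's divided by the standard error of that difference."""
    se_diff = math.sqrt(sigma_r(r_a, n_a) ** 2 + sigma_r(r_b, n_b) ** 2)
    return abs(r_a - r_b) / se_diff


# (r, number of students estimating) for the three terms, from the text and Table I.
groups = {"I": (0.221, 141), "II": (0.455, 160), "III": (0.429, 179)}

for name, (r, n) in groups.items():
    print(name, round(sigma_r(r, n), 3))

# Ratio for the first and second groups, which show the largest difference in r.
ratio_first_second = critical_ratio(*groups["I"], *groups["II"])
print(round(ratio_first_second, 2))
```

With these assumptions the computed standard errors fall within .001 of the tabled values, and the ratio for the first two groups comes out near 2.3, the larger of the two figures quoted.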

By comparing the averages of the three groups it is found that 
there is a general tendency on the part of the whole group to over- 
estimate their score as is shown in Table II. 


TABLE II.—AVERAGES AND STANDARD DEVIATIONS OF ACTUAL AND ESTIMATED 
GRADES 

Group      Average grade      σ       Average estimation      σ 
I              74.3          8.0            81.5             6.8 
II             70.56         9.24           78.85            6.54 
III            66.05         9.79           75.58            6.77 

















From Table II it can be assumed that as the examination tends to 
become more difficult there is less likelihood of estimating correctly 
and there is a slightly greater tendency toward overestimation. 
Upon investigation it is found that the difference between the estimated 
and actual mean is statistically reliable. It should be noted that the 
difference between the standard deviations of the actual and estimated 
average for each group is a statistically reliable one. In each case the 
σ of the estimated average is reliably smaller. 
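
The overestimation read from Table II can be made explicit by subtracting each group's actual average from its estimated average. The sketch below is illustrative only; the variable names are assumptions and the figures are those of the table.

```python
from statistics import median

# (average actual grade, average estimated grade) for each group, from Table II.
table_ii = {"I": (74.3, 81.5), "II": (70.56, 78.85), "III": (66.05, 75.58)}

# Points by which each group's average estimate exceeds its average grade.
overestimates = {g: round(est - act, 2) for g, (act, est) in table_ii.items()}
median_overestimate = median(overestimates.values())

print(overestimates)
print(median_overestimate)
```

The differences grow from 7.2 to 9.53 points as the actual average falls from 74.3 to 66.05, which is the trend the paragraph describes.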

The next topic to be investigated is whether students with high 
grades tend to underestimate as is generally believed. For purposes 
of definition all marks more than 1σ above the average are adjudged 
high. Table III depicts the results. 


TABLE III.—COMPARISON OF ACTUAL AND ESTIMATED AVERAGE OF STUDENTS 
OBTAINING HIGH GRADES 

Group        Number      Actual average      Estimated average 
I              21            85.6                 84.8 
II             25            83.7                 84.5 
III            24            79.6                 81.0 
Average                     (82.8)               (83.3) 














It can readily be seen that people actually obtaining high grades 
do not underestimate their grades as is popularly believed but cor- 
rectly estimate their grades within the limit of reliability. 

Conversely, it is found that those students making low grades 
do not estimate their grades correctly but greatly overestimate their 
results. In this experiment low grades were defined as those more 
than 1σ below the average. Table IV shows these results. 


TABLE IV.—COMPARISON OF ACTUAL AND ESTIMATED AVERAGE OF STUDENTS 
OBTAINING LOW GRADES 

Group        Number      Actual average      Estimated average 
I              24            61.3                 80.6 
II             22            56                   74.3 
III            26            50.6                 69 
Average                     (55.8)               (74.4) 














The last problem to be investigated was whether those people who 
estimated their grades as high actually make high grades and whether 
those people who estimate their grades as low actually make low 
grades. 
This differs from the preceding problem in which it is seen that 
people who actually made high grades could estimate their grades 
correctly, while those who made low grades greatly overestimate their 
marks. A similar procedure was used. High estimated grades were 
defined as those more than 1σ above the estimated average and low 
grades are those below minus 1σ. Tables V and VI summarize the 
results. 


TABLE V.—COMPARISON OF ESTIMATED AND ACTUAL AVERAGES OF THOSE 
STUDENTS WHO ESTIMATE GRADES AS HIGH 

Group        Number      Estimated average      Actual average 
I              19             91.3                  77.7 
II             21             87.7                  76 
III            28             85                    69 
Average                      (87.5)                (73.6) 














It can be seen from Table V that those students who estimate 
their grades as high considerably overestimate. With reference to 
those students who estimate their grades as low, Table VI shows that 
they actually do obtain low grades. 


TABLE VI.—COMPARISON OF ESTIMATED AND ACTUAL AVERAGES OF THOSE 
STUDENTS WHO ESTIMATE GRADES AS LOW 

Group        Number      Estimated average      Actual average 
I              18             69.5                  72 
II             26             67.6                  65 
III            26             60.3                  60.8 
Average                      (65.3)                (64.8) 


CONCLUSIONS 


Briefly the specific conclusions of this study on the ability of 
students to estimate their grades on a one hundred question multiple 
choice examination in general psychology are: 

1. Students tend to estimate their grades correctly on a multiple 
choice examination. Although the correlation is not high enough for 
prediction purposes there is a positive tendency to be correct (median 
r = .43). 


2. A group generally overestimates its grades (median overestimation 
8.2 points). 

3. As the examination becomes more difficult there is a slightly 
greater tendency toward overestimation. 

4. A reliable difference exists in the actual and the estimated σ’s; 
the estimated σ being reliably smaller. 

5. People actually making high grades tend to estimate their 
grades correctly. 

6. People actually making low grades greatly overestimate. 

7. Students who estimate their grades as high greatly overestimate 
their grades. 


8. Students who estimate their grades as low actually do obtain 
low grades. 


SPEARMAN-BROWN FORMULA APPLIED TO RATINGS 
OF PERSONALITY TRAITS 


EDWARD L. CLARK 


Northwestern University 


Can the Spearman-Brown formula for predicting the reliability of 
a test of increased length be used for predicting the augmented reliability 
of average ratings when the number of judges is increased? In the formula 

$$r_n = \frac{n r_1}{1 + (n - 1) r_1},$$

$r_1$ is the reliability of the test of unit 
length, n is the length of the augmented test in terms of the test of 
unit length, and $r_n$ is the predicted reliability of the augmented test 
of n length. This formula has been found to work fairly well if the 
added test items are similar in nature to those already in the test. 
Defining the reliability of a single rating, or judgment, as the corre- 
lation which numerous pairs of single ratings on any group will show 
(and not as the self consistency of judges rating persons on two sepa- 
rate occasions), we shall expect the S-B formula, if applicable, to 
predict the correlations to be found between the average of two 
ratings and two others or of three ratings and three others from the 
known reliability of a single rating. 
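
As a small illustration of the prediction described here, the Spearman-Brown formula can be evaluated for several numbers of pooled ratings. The function name and the single-rating reliability of .20 are hypothetical choices for the example and are not taken from the study.

```python
def spearman_brown(r_1, n):
    """Predicted reliability of the average of n ratings, given the
    reliability r_1 of a single rating (Spearman-Brown formula)."""
    return n * r_1 / (1.0 + (n - 1) * r_1)


r_single = 0.20  # hypothetical reliability of one rating

# Predicted reliability for the average of one, two, three, and six ratings.
predicted = {n: round(spearman_brown(r_single, n), 3) for n in (1, 2, 3, 6)}
print(predicted)
```

With n = 1 the formula returns the single-rating reliability unchanged, and the predicted value rises as more ratings are averaged.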

In order to make an empirical test of this formula, we used ratings 
on five personality traits of three hundred male students made by six 
raters per student. The six ratings were divided by chance, three 
being used to determine relationships between pairs of single ratings 
and three being put aside. By determining the coefficient of corre- 
lation between single ratings and using the formula, it was assumed 
that we could, if the S-B formula is applicable, determine the relia- 
bility of the three ratings taken together; that is, predict the corre- 
lation of these three ratings with three others, (those that were first 
put aside). The correlation of three original ratings with three others 
may be taken as the reliability of three ratings just as we take the 
correlation of the results of Form A and Form B of a test as being 
the reliability of one form of the test. This can readily be seen to 
be a more rigid and natural test of the formula for prediction than 
would have been the case if all six ratings on each student had been 
used to determine the initial correlations, $r_{1-1}$, and again used to find 
the correlation between the first three ratings and the second three.¹ 





1 Remmers, H. H., Shrock, N. W., and Kelly, E. L.: “An empirical study of the 
validity of the Spearman-Brown formula as applied to the Purdue Rating Scale.” 
Jour. of Educ. Psychology, Vol. XVIII, 1927, pp. 187-195. 


TABLE I 
In this table are shown the coefficients of correlation between the average of 
the first three ratings and the average of the second three, between single ratings 
of the first three, and between single ratings which would, by the use of the S-B 
formula, predict the obtained correlations between three ratings and three ratings. 
Correlations for each of the five traits are shown. Probable errors for the obtained 
r’s between single ratings are approximately ±.047. 

Traits                                          I       II      III     IV      V 
Obtained $r_{3-3}$                            .2942   .4452   .3950   .3358   .4262 
Obtained $r_{1-1}$                            .0908   .2031   .2392   .1631   .2041 
$r_{1-1}$ necessary to predict $r_{3-3}$      .1220   .2111   .1775   .1442   .1985 
Difference between obtained $r_{1-1}$ 
  and necessary $r_{1-1}$                     .0312   .0080   .0617   .0189   .0056 











The nine hundred possible pairs of ratings available for each of the 
five personality traits¹ were first used in getting a measure of the 
relation of one rating to another, the reliability, as here considered, 
of single ratings, designated in the table as “Obtained $r_{1-1}$.” In 
Table I are to be seen also the coefficients of correlation which existed 
between the first three ratings and the second three (three hundred 
pairs) for each trait. Instead of raising the first coefficients by use 
of the S-B formula and comparing the predicted values with those 
computed, the formula was rewritten so as to determine the coefficients 
of correlation for pairs of single ratings which would predict 
the actually obtained “three with three” coefficients. These are 
given in the third row of the table and may be seen at a glance to be 
very similar to those coefficients obtained from nine hundred pairs of 
single ratings. In only one case out of five is one of these “necessary” 
coefficients beyond one probable error of the obtained $r_{1-1}$, and the 
average difference between the obtained and “necessary” r’s is less 
than .5 of one probable error of the obtained $r_{1-1}$. Since this difference 
is much less than chance causes might well make it, we conclude 
that the data of Table I indicate that the S-B formula may be used 
to predict the reliability of at least three ratings. 
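
The rewritten formula is not printed in the article. Solving the Spearman-Brown formula for the single-rating coefficient gives r_1 = r_n / (n - (n - 1) r_n), and the sketch below uses that algebraic inversion (an assumption about how the formula was rewritten) to recompute the “necessary” coefficients from the obtained three-with-three correlations of Table I. The names are illustrative.

```python
def necessary_r1(r_n, n=3):
    """Invert Spearman-Brown: the single-rating r that would predict r_n for n ratings."""
    return r_n / (n - (n - 1) * r_n)


# Obtained coefficients for traits I to V, copied from Table I.
obtained_r33 = [0.2942, 0.4452, 0.3950, 0.3358, 0.4262]
obtained_r11 = [0.0908, 0.2031, 0.2392, 0.1631, 0.2041]
PROBABLE_ERROR = 0.047  # approximate probable error of the obtained single-rating r's

necessary = [round(necessary_r1(r), 4) for r in obtained_r33]
gaps = [round(abs(need - got), 4) for need, got in zip(necessary, obtained_r11)]
beyond_one_pe = sum(gap > PROBABLE_ERROR for gap in gaps)

print(necessary)
print(gaps)
print(beyond_one_pe)
```

Under this inversion the recomputed values agree with the third row of Table I to within about .001 (the largest discrepancy is for trait III), and only one of the five differences exceeds one probable error, as the text states.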

1 The traits rated may be designated as personality, persistence, intelligence, 
forcefulness and reliability respectively. A reproduction of the graphic rating 
scale used may be found in Hopkins, L. B.: “Personnel Work at Northwestern 
University.” Journal of Personnel Research, Vol. I, 1922, pp. 277-288. 

As an additional test of the formula and to get empirical evidence 
as to the number of pairs of ratings necessary in the original correlations, 
coefficients were computed with twenty-five, seventy-five, 
one hundred fifty, two hundred twenty-five, and four hundred fifty 
pairs of single ratings. As shown in Table II below, three hundred ten 
correlations were computed and by means of the S-B formula raised 
to predict the relationship of three ratings with three ratings. Again 
using as criteria the actual correlations obtained between “three and 
three,” the table shows the percentage of each group of predicted 
coefficients falling within designated limits of the criterion coefficients. 
For example, of those correlations based on twenty-five pairs, 6.7 per 
cent yielded predictions which were within .05 of the obtained $r_{3-3}$, 
ten per cent were within .10, 26.7 per cent were within .15, and 33.3 
per cent were above the criterion coefficients, while of those correlations 
based on seventy-five pairs 18 per cent of the predictions were 
within .05, twenty-eight per cent within .10, etc. 


TABLE II.—SHOWING THE PERCENTAGES OF CORRELATIONS BASED ON PAIRS OF 
SINGLE RATINGS WHICH WHEN RAISED BY SPEARMAN-BROWN FORMULA 
WILL FALL WITHIN DESIGNATED LIMITS OF CRITERION COEFFICIENTS OF 
CORRELATION OBTAINED BETWEEN THREE RATINGS AND THREE 
RATINGS 

                       Percentages of augmented r’s 
No. of pairs of        within designated limits of       Above criterion     No. of r’s used 
single ratings         criterion coefficients            coefficients,       to determine 
used for each r          .05       .10       .15         percentages         percentages 
       25                6.7      10.       26.7            33.3                 60 
       75               18.       28.       50.             38.                  50 
      150               26.7      41.7      58.3            53.3                 60 
      225               32.5      57.5      77.5            67.5                 80 
      450               38.3      71.7      85.             68.3                 60 




















It will be seen that as the number of pairs of single ratings increases, 
the percentage of predicted coefficients falling within designated limits 
of the five criteria also increases. While it is difficult to estimate the 
most likely percentage which chance causes alone would give (we do 
not know the true relationships existing between the obtained ‘‘ three 
with three” ratings), we should expect the percentages to fall into 
sequences such as shown in this table. These sequences may be 
taken as indicating that the formula is applicable to this type of data 
except as chance causes interfere. In addition, as shown in the fifth 
column, the percentages of predictions above the criterion coefficients 
average only a little more than fifty per cent. While the chance 
percentage is more likely fifty than any other, the ratios for small 
numbers of correlations such as fifty or sixty may be expected to 
fluctuate. This table further indicates that when predicting the 
reliability of three ratings from the computed reliability of a single 
rating, especially in rating scale work where such coefficients are 
almost certain to be rather low, practical results may be obtained 
only when several hundreds of pairs of single ratings are used. 


BOOK REVIEWS 


Johnson O’Connor. Psychometrics. Cambridge: Harvard University 
Press. Pp. XXXIV + 292. 


This book is divided into three parts. Book I is a discussion, at 
the empirical level, of the sampling error of measures of central 
tendency and variability. O’Connor finds that, with data from 
psychological tests, the median is frequently more reliable than the 
mean. Sometimes the average (mean) of the measures between 
certain percentiles is more reliable than either mean or median. One 
might expect the form of the distribution to have a bearing on these 
results; the original distributions, however, are not given in the text; 
and all that the author states about them is merely that they are 
frequently skewed. An error may be noted in the tables on p. 64. 
By using formulas (8) and (9) on pp. 40 and 41, O’Connor evidently 
assumes that the standard error of any mean (whether the mean of 
all the scores of a distribution, or the mean of the scores between 
stated percentiles) is given by the single expression, σ/√N (p. 40). 

Book II is a discussion of the reliability of measurement. The 
author at great length, with much severity, and with every appearance 
of supposed originality, criticizes the reliability coefficient because—it 
varies in different ranges! O’Connor apparently overlooks the well- 
known fact that, if an absolute index of error of measurement is 
desired, one may use the standard error of measurement (which 
remains constant in different ranges). Instead, O’Connor presents, 
without proof or demonstration, some questionable standard error 
formulas of his own (pp. 115 and 116). 

Book III is devoted largely to a demonstration of how a multiple-item 
test may be “purified” by item-analysis. This familiar process 
of improving the validity and internal consistency of a test is here 
presented as a method of arriving at psychological “elements” 
(comparable to the chemical elements). O’Connor believes that after a 
test has been “purified,” the test measures one “element” only. 
The author here seems to have been captivated by the term “purify,” 
and to have taken it at its face-value. Regardless of how the term 
“element” is defined, a test “purified” by item analysis can hardly be 
elementally “pure,” i.e., free from all heterogeneity. For one thing, 
heterogeneity is indicated by the circumstance that the item-values 
of the individual items in a “purified” test are far from uniform; in 
fact, the item-value of the best item, in each of O’Connor’s tests, 
differs from that of the poorest accepted item, much more than the 
value of the poorest accepted item differs from that of the average 
rejected one. A second consideration is the fact that the intercorrelations 
among the items included in a “purified” test are, in all 
likelihood, considerably below 1.00, even if corrected for attenuation. 
Besides, a test which is pure and elemental in one sample may be 
impure and complex in another. O’Connor’s work fails to give much 
ground for immediate hope in the possibility of empirical measurement 
of psychological “elements.” 

The tabular presentation of the empirical data of the book is 
exceptionally complete. Practically no discussion is offered, however, 
of the relevant bibliography. H. S. Conrad. 

University of California. 


H. C. Mills, M. E. Wagner, R. E. Eckert and M. E. Sarbaugh. 
Studies in Articulation of High School and College. Buffalo: 
University of Buffalo Studies, Vol. IX, 1934, pp. XIII + 319. 


This volume, edited by Professor E. S. Jones, presents new data 
on a perennial problem in higher education. Although important 
steps have been taken toward improving the articulation between 
high school and college, such articulation is at present far from satis- 
factory. The group of studies in this monograph is concerned with 
three factors which condition in an important manner any program 
designed to facilitate the transition from high school to college: (1) 
Nature of the superior student; (2) prediction of college performance, 
and (3) measurement of overlapping between high school and college. 

The recent trend toward giving greater recognition to the superior 
student has received an added emphasis at the University of Buffalo. 
No unfavorable results appeared from encouraging superior students 
to accelerate their progress by anticipating some of their college work 
while still in high school and to finish the high school—college work in 
seven years. 

As has been found in other investigations, high school (Regents) 
grades were more satisfactory than other measures for predicting 
success in college. The relatively high relationship between Regents 
grades in languages and college achievement is ‘‘in all probability an 
indication of the extent to which the language element enters into all 
high school and college work.” 

Much overlapping was found between high school and college 
courses, especially in physics, chemistry, economics and American 
History. The superior high school students were able in many cases 
to anticipate satisfactorily through independent study the college work 
in these overlapping courses. 

All those interested in higher education will read with profit this 
excellent contribution which reports only a first step toward improved 
articulation at the University of Buffalo. Publications on future 
progress will be welcomed. Miles A. Tinker. 

University of Minnesota. 


Edna E. Kramer. A First Course in Educational Statistics. New 
York: John Wiley and Sons, 1935, pp. VI + 212. 


This book is designed to be used as a text in a one-semester course 
for undergraduates. Within the limitations of this purpose it gives a 
reasonably good treatment. It is definitely shorter than most other 
books in its field. The shortening is accomplished largely by omission 
rather than by compression. The material retained is nearly all 
essential, and most of the omissions are reasonable. But in the opinion 
of the reviewer the absence of any treatment of partial and multiple 
correlation must remain a serious defect. It is to be hoped that in 
future editions the author will find some way to include this mate- 
rial, even if necessary at the expense of omitting other topics or of 
condensation. 

The book has many features of definite merit. Perhaps the most 
important of these is its use of a wide range of real educational data 
in the majority of its numerous and excellent exercises. Another 
important advantage lies in its correct treatment of the difficult 
matter of step-intervals. The discussions of interpolation, the 
relative values of exact and approximate methods of computation, 
and the distinction between frequency curves and probability curves 
are all above average in clarity. 

There are also a number of definite faults. Tables are often placed 
a page or two ahead of or behind the references to them. In regard 
to the standard deviation, ‘“‘ ... the square root... is taken, 
in order to undo, somewhat, the result of squaring the deviations 
originally” (pp. 39-40), rather than to obtain a linear or first degree 
measure. ‘‘In order to obtain a square root to two places, it was 
necessary to keep four places under the radical sign” (p. 43, italics 
in the text), instead of the usual two or at most three. The coefficient 
of variation is described without any caution regarding its use with 
test scores having indeterminate zero-points. ‘‘Statistical methods 
can determine the sampling error in the mean, namely, the difference 
between the true mean and the mean obtained from the sample” 
(p. 111, italics in the text), a loose statement. So much emphasis 
is placed on ±3σ that the student might easily get the impression that 
the normal curve actually ends at these points. The sampling dis- 
tribution of the correlation coefficient is treated as if it were normal for 
all large or fairly large samples, regardless of the value of r. The 
standard error of estimate is treated as if it were a sampling error. 
And finally, in the last chapter, the author resorts definitely to com- 
pression, with a consequent loss in readability. 

If this list of faults is imposing, it is neither exhaustive nor unusual. 
Many of these same errors, as well as many others not present in this 
text, appear in most of the other books on educational statistics 
now on the market. In spite of them, this book is on the whole 
superior, for the purpose for which it was written, to most if not all 
of the other available texts. EDWARD E. CURETON. 

Alabama Polytechnic Institute. 


E. R. Guthrie. The Psychology of Learning. New York, Harper & 
Brothers, 1935, pp. VIII + 258. 


I burst into loud guffaws of laughter as I read this book. I there- 
fore commend it heartily to the readers of the Journal of Educational 
Psychology, for very few of them, I feel sure, ever got much amuse- 
ment out of their reading. 

Yet Guthrie’s book must not be regarded as a flippant treatise. On 
the contrary it contains a closely argued thesis written, most fortu- 
nately, in a bright and compelling style. In it will be found a summary 
and discussion of much of the psychological material on learning. 
Most of the work referred to is drawn from the field of animal learning, 
and much of it centers around Pavlov’s work. 

The book would have been improved by the inclusion of more 
material on human learning and by the discussion of Thorndike’s 
acute analysis of conditioning which is found in his “Fundamentals 
of Learning.” 

Guthrie reserves the word learning for the more lasting effects 
of practice. He maintains, in a vigorous defence of a somewhat 
discredited associationism, that “it is the time relations between 
the substitute stimulus and the response that count. It is not the 
two stimuli that are associated, but the substitute cue and the resultant 
act.” This is Guthrie’s main contribution. It probably is a very 
important one. PETER SANDIFORD. 
University of Toronto. 


Ira S. Wile. Handedness: Right and Left. Boston: Lothrop, Lee 
and Shepard Co., 1934, pp. XIII + 439. 


Why are there more right-handed people than left-handed? Has 
the dominance of the right hand always been as extensive as it is 
today? Do animals show handedness? Is handedness inherited; 
is it due to differentiated blood supply to the two hemispheres; is it 
merely a product of social requirements? Such questions as these 
have been answered in many different ways, by many different 
writers over a long period of time. And the end is not yet. 

Dr. Wile has written an excellent summary and integration of 
this troublesome field. The extent of his scholarship is indicated in 
only the most formal way by his nine hundred references. Successive 
chapters discuss the anthropology, philology, prevalence, causation 
(including psychological, physiological, neurological, social and cosmic 
theories) of handedness; the relations of handedness in folklore, 
magic and religion; and finally, the part played by handedness in the 
organization and development of personality. Thus this work removes 
the phenomenon of handedness from the narrow confines of psycho- 
physiology, to the extensity of our cultural history. To exhibit a 
scholarly grasp of all the many relationships, as Dr. Wile has done, is 
no mean achievement. C. M. LOUTTIT. 
Indiana University. 